Evaluating Taint Specification Generators for Identifying Taint Sources in Relation to Data Safety Section
Security and privacy researchers uncover privacy compliance issues in Android apps using taint analysis, which requires lists of sensitive source and sink APIs (i.e., a taint specification). Because manually crafting a comprehensive, up-to-date taint specification across the tens of thousands of APIs in the Android framework is impractical, automatic taint specification generators have been developed. Two recent approaches are CoDoC and DocFlow. Separately, Google introduced the Google Play Data Safety Section (DSS), where app developers disclose their apps’ privacy practices. Since taint sources vary by data category, DSS-related taint sources (DSSTSs) are essential for researchers analyzing DSS compliance. Currently, neither studies on automatic DSSTS identification nor relevant evaluation datasets exist. In this paper, as a step towards automatic DSSTS identification, we evaluate how well CoDoC and DocFlow identify DSSTSs. We collect taint sources referenced in prior privacy-related studies and official documentation and map them to 11 DSS data categories to create a dataset of 505 APIs. Using this dataset, we find that CoDoC and DocFlow perform well in certain data categories, suggesting that they are suited to category-specific investigation. Additionally, we apply CoDoC and DocFlow to classify the entire Android framework, revealing that both identify a substantial number of taint sources, ranging from 21% to 57%. We also show that running a taint analyzer with the generated taint source lists produces a large number of leak detections, imposing high verification costs. Our findings highlight the limitations of existing generators in identifying DSSTSs. We release the dataset to the community.
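As a rough illustration of what a category-level evaluation of this kind could look like, the sketch below computes per-category recall of a generator's predicted source list against a labeled dataset. It is not the paper's evaluation code. The function name `per_category_recall`, the sample API-to-category mapping, and the choice of recall as the metric are all assumptions made for illustration; the abstract does not state which metrics are used.

```python
from collections import defaultdict


def per_category_recall(labeled, predicted_sources):
    """Return recall per data category.

    labeled: dict mapping an API signature to its data category
             (hypothetical stand-in for a labeled DSSTS dataset).
    predicted_sources: set of API signatures a generator flagged as sources.
    """
    totals = defaultdict(int)  # labeled APIs per category
    hits = defaultdict(int)    # labeled APIs the generator also flagged
    for api, category in labeled.items():
        totals[category] += 1
        if api in predicted_sources:
            hits[category] += 1
    return {category: hits[category] / totals[category] for category in totals}


# Illustrative data only: the category assignments are assumptions,
# not entries taken from the released dataset.
labeled = {
    "android.location.Location.getLatitude()": "Location",
    "android.location.Location.getLongitude()": "Location",
    "android.telephony.TelephonyManager.getDeviceId()": "Device or other IDs",
    "android.accounts.AccountManager.getAccounts()": "Personal info",
}
predicted = {
    "android.location.Location.getLatitude()",
    "android.telephony.TelephonyManager.getDeviceId()",
}

recall = per_category_recall(labeled, predicted)
for category, value in sorted(recall.items()):
    print(f"{category}: {value:.2f}")
```

Reporting the score per category rather than as a single aggregate matches the abstract's observation that the generators perform well only in certain data categories.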
Slides: inayoshi_mobilesoft2025_slides.pdf (1.29 MiB)
Paper: inayoshi_mobilesoft2025.pdf (166 KiB)
Sun 27 Apr. Displayed time zone: Eastern Time (US & Canada)
14:00 - 15:30

Time | Duration | Type | Title | Track | Speakers / Authors
--- | --- | --- | --- | --- | ---
14:00 | 30m | Talk | Talk 2: On the Generation of Meaningful Test Cases for Mobile Apps | Research Track | Leonardo Mariani (University of Milano-Bicocca)
14:30 | 15m | Research paper | Can you mimic me? Exploring the Use of Android Record & Replay Tools in Debugging | Research Track | Zihe Song (University of Texas at Dallas), S M Hasan Mansur (George Mason University), Ravishka Shemal Rathnasuriya (University of Texas at Dallas), Yumna Fatima (George Mason University), Wei Yang (UT Dallas), Kevin Moran (University of Central Florida), Wing Lam (George Mason University)
14:45 | 15m | Research paper | Evaluating Taint Specification Generators for Identifying Taint Sources in Relation to Data Safety Section | Research Track | Hiroki Inayoshi (Okayama University), Shoichi Saito (Nagoya Institute of Technology), Akito Monden (Okayama University)
15:00 | 15m | Research paper | On-Device Mobile Application Testing | Research Track | Inte Vleminckx (University of Antwerp), Elif Parlak (University of Antwerp), Onur Kilincceker (University of Antwerp and Flanders Make vzw), Serge Demeyer (University of Antwerp and Flanders Make vzw)
15:15 | 15m | Talk | Talk 3: Conquering iOS PREQ Hurdles | Research Track | Arth Patel (Meta Inc)