Volume 2019 (2019): Issue 3 (July 2019)
Journal Details
License
Format
Journal
First Published
16 Apr 2015
Publication timeframe
4 times per year
Languages
English
Access type: Open Access

MAPS: Scaling Privacy Compliance Analysis to a Million Apps

Published Online: 12 Jul 2019
Page range: 66 - 86

The app economy is largely reliant on data collection as its primary revenue model. To comply with legal requirements, app developers are often obligated to notify users of their privacy practices in privacy policies. However, prior research has suggested that many developers are not accurately disclosing their apps’ privacy practices. Evaluating discrepancies between apps’ code and privacy policies enables the identification of potential compliance issues. In this study, we introduce the Mobile App Privacy System (MAPS) for conducting an extensive privacy census of Android apps. We designed a pipeline for retrieving and analyzing large app populations based on code analysis and machine learning techniques. In its first application, we conduct a privacy evaluation for a set of 1,035,853 Android apps from the Google Play Store. We find broad evidence of potential non-compliance. Many apps do not have a privacy policy to begin with. Policies that do exist are often silent on the practices performed by apps. For example, 12.1% of apps have at least one location-related potential compliance issue. We hope that our extensive analysis will motivate app stores, government regulators, and app developers to more effectively review apps for potential compliance issues.
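The comparison described in the abstract, where a practice found in an app's code but not disclosed in its privacy policy counts as a potential compliance issue, can be illustrated with a minimal sketch. This is not the MAPS implementation: the `AppAnalysis` record, its field names, the practice labels, and the rule that an app without a policy discloses nothing are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppAnalysis:
    """Hypothetical per-app record; all field names are illustrative."""
    app_id: str
    practices_in_code: frozenset    # practices detected by code analysis
    practices_in_policy: frozenset  # practices the policy text discloses
    has_policy: bool


def potential_issues(app):
    """Return practices performed in code but not disclosed in the policy.

    Assumption: an app without any policy discloses nothing, so every
    practice found in its code is flagged.
    """
    if not app.has_policy:
        return set(app.practices_in_code)
    return set(app.practices_in_code - app.practices_in_policy)


def share_with_issue(apps, prefix):
    """Fraction of apps with at least one flagged practice whose label
    starts with `prefix` (for example, "location")."""
    if not apps:
        return 0.0
    flagged = sum(
        1 for app in apps
        if any(p.startswith(prefix) for p in potential_issues(app))
    )
    return flagged / len(apps)


# Toy data with made-up app identifiers and practice labels.
apps = [
    AppAnalysis("com.example.a", frozenset({"location_gps"}),
                frozenset({"location_gps"}), True),
    AppAnalysis("com.example.b", frozenset({"location_gps", "contacts"}),
                frozenset({"contacts"}), True),
    AppAnalysis("com.example.c", frozenset({"location_cell"}),
                frozenset(), False),
    AppAnalysis("com.example.d", frozenset(), frozenset(), False),
]

location_share = share_with_issue(apps, "location")
```

A statistic such as the share of apps with at least one location-related potential compliance issue would be an aggregate of this kind, computed over the full app population rather than four toy records.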

