Volume 2019 (2019): Issue 3 (July 2019)
Journal Details
Format: Journal
eISSN: 2299-0984
First Published: 16 Apr 2015
Publication Frequency: 4 issues per year
Languages: English
Access type: Open Access

Investigating Statistical Privacy Frameworks from the Perspective of Hypothesis Testing

Published Online: 12 Jul 2019
Page range: 233 - 254
Received: 30 Nov 2018
Accepted: 16 Mar 2019
Abstract

Over the last decade, differential privacy (DP) has emerged as the gold standard of a rigorous and provable privacy framework. However, there are very few practical guidelines on how to apply differential privacy, and a key challenge is how to set an appropriate value for the privacy parameter ε. In this work, we employ a statistical tool called hypothesis testing to discover useful and interpretable guidelines for state-of-the-art privacy-preserving frameworks. We formalize and implement hypothesis testing in terms of an adversary's capability to infer mutually exclusive sensitive information about the input data (such as whether or not an individual has participated) from the output of the privacy-preserving mechanism. We quantify the success of the hypothesis testing using the precision-recall relation, which provides an interpretable and natural guideline for practitioners and researchers on selecting ε. Our key results include a quantitative analysis of how hypothesis testing can guide the choice of the privacy parameter ε in an interpretable manner for a differentially private mechanism and its variants. Importantly, our findings show that an adversary's auxiliary information, in the form of the prior distribution of the database and correlation across records and time, influences the proper choice of ε. Finally, we also show how the perspective of hypothesis testing can provide useful insights into the relationships among a broad range of privacy frameworks, including differential privacy, Pufferfish privacy, Blowfish privacy, dependent differential privacy, inferential privacy, membership privacy, and mutual-information-based differential privacy.
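To make the hypothesis-testing view concrete, the following minimal sketch computes the precision and recall of a simple threshold adversary. It is an illustration under stated assumptions, not the procedure from the paper: it assumes a counting query with sensitivity 1 protected by the Laplace mechanism, an adversary who decides "participated" when the noisy output exceeds a fixed threshold, and a prior probability of participation. The names `laplace_sf`, `adversary_precision_recall`, `prior`, and `threshold` are hypothetical.

```python
import math


def laplace_sf(x, scale):
    """Return P(Z > x) for Z ~ Laplace(0, scale)."""
    if x >= 0:
        return 0.5 * math.exp(-x / scale)
    return 1.0 - 0.5 * math.exp(x / scale)


def adversary_precision_recall(epsilon, prior=0.5, threshold=0.5):
    """Precision and recall of a threshold test on a noisy count.

    Assumed setting (not taken from the paper):
      H0: the individual is absent, so the shifted true count is 0.
      H1: the individual is present, so the shifted true count is 1.
    The mechanism adds Laplace noise with scale 1 / epsilon, and the
    adversary decides H1 whenever the noisy count exceeds `threshold`.
    """
    scale = 1.0 / epsilon
    false_positive_rate = laplace_sf(threshold, scale)   # output centred at 0
    recall = laplace_sf(threshold - 1.0, scale)          # output centred at 1
    true_positives = prior * recall
    false_positives = (1.0 - prior) * false_positive_rate
    precision = true_positives / (true_positives + false_positives)
    return precision, recall


if __name__ == "__main__":
    for eps in (0.1, 1.0, 5.0):
        precision, recall = adversary_precision_recall(eps)
        print(f"epsilon={eps}: precision={precision:.3f}, recall={recall:.3f}")
```

In this sketch, a smaller ε pushes the adversary's precision toward the prior, while a larger ε lets both precision and recall approach 1. Changing `prior` shows how the adversary's auxiliary information shifts the precision obtained at a fixed ε.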

