1. bookVolume 2022 (2022): Issue 1 (January 2022)
Journal Details
License
Format
Journal
First Published
16 Apr 2015
Publication timeframe
4 times per year
Languages
English
access type Open Access

Making the Most of Parallel Composition in Differential Privacy

Published Online: 20 Nov 2021
Page range: 253 - 273
Received: 31 May 2021
Accepted: 16 Sep 2021
Journal Details
License
Format
Journal
First Published
16 Apr 2015
Publication timeframe
4 times per year
Languages
English
Abstract

We show that the ‘optimal’ use of the parallel composition theorem corresponds to finding the size of the largest subset of queries that ‘overlap’ on the data domain, a quantity we call the maximum overlap of the queries. It has previously been shown that a certain instance of this problem, formulated in terms of determining the sensitivity of the queries, is NP-hard, but also that it is possible to use graph-theoretic algorithms, such as finding the maximum clique, to approximate query sensitivity. In this paper, we consider a significant generalization of the aforementioned instance which encompasses both a wider range of differentially private mechanisms and a broader class of queries. We show that for a particular class of predicate queries, determining if they are disjoint can be done in time polynomial in the number of attributes. For this class, we show that the maximum overlap problem remains NP-hard as a function of the number of queries. However, we show that efficient approximate solutions exist by relating maximum overlap to the clique and chromatic numbers of a certain graph determined by the queries. The link to chromatic number allows us to use more efficient approximate algorithms, which cannot be done for the clique number as it may underestimate the privacy budget. Our approach is defined in the general setting of f-differential privacy, which subsumes standard pure differential privacy and Gaussian differential privacy. We prove the parallel composition theorem for f-differential privacy. We evaluate our approach on synthetic and real-world data sets of queries. We show that the approach can scale to large domain sizes (up to 1020000), and that its application can reduce the noise added to query answers by up to 60%.

Keywords

[1] Hassan Jameel Asghar, Ming Ding, Thierry Rakotoarivelo, Sirine Mrabet, and Dali Kaafar. Differentially private release of datasets using Gaussian copula. Journal of Privacy and Confidentiality, 10(2), 2020. Search in Google Scholar

[2] Daniel Brélaz. New methods to color the vertices of a graph. Communications of the ACM, 22(4):251–256, 1979. Search in Google Scholar

[3] Mark Bun and Thomas Steinke. Concentrated differential privacy: simplifications, extensions, and lower bounds. In Theory of Cryptography Conference, pages 635–658. Springer, 2016. Search in Google Scholar

[4] United States Census Bureau. Current population survey data. https://www.census.gov/programs-surveys/cps/data.html. Accessed 1 September 2021. Search in Google Scholar

[5] Timothy M Chan. Klee’s measure problem made easy. In 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, pages 410–419. IEEE, 2013. Search in Google Scholar

[6] Jens Clausen. Branch and bound algorithms: principles and examples. Technical report, Department of Computer Science, University of Copenhagen, 1999. Search in Google Scholar

[7] Olivier Coudert. Exact coloring of real-life graphs is easy. In Proceedings of the 34th Annual Design Automation Conference, pages 121–126, 1997. Search in Google Scholar

[8] Transaction Processing Performance Council. TPCHomepage. http://www.tpc.org. Accessed 1 September 2021. Search in Google Scholar

[9] Apurba Das, Michael Svendsen, and Srikanta Tirthapura. Incremental maintenance of maximal cliques in a dynamic graph. The VLDB Journal, 28(3):351–375, 2019. Search in Google Scholar

[10] Reinhard Diestel. Graph Theory. Springer, 2005. Search in Google Scholar

[11] Jinshuo Dong, Aaron Roth, and Weijie J. Su. Gaussian differential privacy. arXiv preprint arXiv:1905.02383, 2019. Search in Google Scholar

[12] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. In Proceedings of the Third Conference on Theory of Cryptography, pages 265–284. Springer-Verlag, 2006. Search in Google Scholar

[13] Cynthia Dwork and Aaron Roth. The Algorithmic Foundations of Differential Privacy. Now, 2014. Search in Google Scholar

[14] Cynthia Dwork and Guy N. Rothblum. Concentrated differential privacy. arXiv preprint arXiv:1603.01887, 2016. Search in Google Scholar

[15] Hamid Ebadi, David Sands, and Gerardo Schneider. Differential privacy: now it’s getting personal. ACM SIGPLAN Notices, 50(1):69–81, 2015. Search in Google Scholar

[16] Jacob Fox and Benny Sudakov. Density theorems for bipartite graphs and related Ramsey-type results. Combinatorica, 29(2):153–196, 2009. Search in Google Scholar

[17] Aric A. Hagberg, Daniel A. Schult, and Pieter J. Swart. Exploring network structure, dynamics, and function using networkx. In Gaël Varoquaux, Travis Vaught, and Jarrod Millman, editors, Proceedings of the 7th Python in Science Conference, pages 11–15, 2008. Search in Google Scholar

[18] Moritz Hardt and Guy N. Rothblum. A multiplicative weights mechanism for privacy-preserving data analysis. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, pages 61–70. IEEE, 2010. Search in Google Scholar

[19] Ali Inan, Mehmet Emre Gursoy, and Yucel Saygin. Sensitivity analysis for non-interactive differential privacy: bounds and efficient algorithms. IEEE Transactions on Dependable and Secure Computing, 17(1):194–207, 2017. Search in Google Scholar

[20] Peter Kairouz, Sewoong Oh, and Pramod Viswanath. The composition theorem for differential privacy. In International Conference on Machine Learning, pages 1376–1385. PMLR, 2015. Search in Google Scholar

[21] Daphne Koller and Nir Friedman. Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009. Search in Google Scholar

[22] Janez Konc and Dušanka Janezic. An improved branch and bound algorithm for the maximum clique problem. Proteins, 4(5), 2007. Search in Google Scholar

[23] Rhyd Lewis. A Guide to Graph Colouring. Springer, 2015. Search in Google Scholar

[24] Ryan McKenna, Gerome Miklau, Michael Hay, and Ashwin Machanavajjhala. Optimizing error of high-dimensional statistical queries under differential privacy. Proceedings of the VLDB Endowment, 11(10):1206–1219, 2018. Search in Google Scholar

[25] Ryan McKenna, Daniel Sheldon, and Gerome Miklau. Graphical-model based estimation and inference for differential privacy. In International Conference on Machine Learning, pages 4435–4444. PMLR, 2019. Search in Google Scholar

[26] Frank D McSherry. Privacy integrated queries: an extensible platform for privacy-preserving data analysis. In Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, pages 19–30, 2009. Search in Google Scholar

[27] Ilya Mironov. Rényi differential privacy. In 2017 IEEE 30th Computer Security Foundations Symposium, pages 263–275. IEEE, 2017. Search in Google Scholar

[28] Noman Mohammed, Rui Chen, Benjamin C. M. Fung, and Philip S. Yu. Differentially private data release for data mining. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 493–501, 2011. Search in Google Scholar

[29] David R. Morrison, Sheldon H. Jacobson, Jason J. Sauppe, and Edward C. Sewell. Branch-and-bound algorithms: a survey of recent advances in searching, branching, and pruning. Discrete Optimization, 19:79–102, 2016. Search in Google Scholar

[30] Jack Murtagh and Salil Vadhan. The complexity of computing the optimal composition of differential privacy. In Theory of Cryptography Conference, pages 157–175. Springer, 2016. Search in Google Scholar

[31] Jan Mycielski. Sur le coloriage des graphes. Colloquium Mathematicae, 3(2):161–162, 1955. Search in Google Scholar

[32] Vitaly I. Voloshin. Introduction to Graph and Hypergraph Theory. Nova Science Publishers, 2009. Search in Google Scholar

[33] Duc Thach Son Vu. Numerical resolution of algebraic systems with complementarity conditions: application to the thermodynamics of compositional multiphase mixtures. PhD thesis, Université Paris-Saclay, 2020. Search in Google Scholar

[34] Larry Wasserman and Shuheng Zhou. A statistical framework for differential privacy. Journal of the American Statistical Association, 105(489):375–389, 2010. Search in Google Scholar

[35] Qinghua Wu and Jin-Kao Hao. A review on algorithms for maximum clique problems. European Journal of Operational Research, 242(3):693–709, 2015. Search in Google Scholar

[36] Xiaokui Xiao and Yufei Tao. Output perturbation with query relaxation. Proceedings of the VLDB Endowment, 1(1):857–869, 2008. Search in Google Scholar

[37] Dan Zhang, Ryan McKenna, Ios Kotsogiannis, Michael Hay, Ashwin Machanavajjhala, and Gerome Miklau. Ektelo: a framework for defining differentially-private computations. In Proceedings of the 2018 International Conference on Management of Data, pages 115–130, 2018. Search in Google Scholar

Recommended articles from Trend MD

Plan your remote conference with Sciendo