1. bookVolume 2020 (2020): Issue 4 (October 2020)
Journal Details
License
Format
Journal
First Published
16 Apr 2015
Publication timeframe
4 times per year
Languages
English
access type Open Access

Publishing Community-Preserving Attributed Social Graphs with a Differential Privacy Guarantee

Published Online: 17 Aug 2020
Page range: 131 - 152
Received: 29 Feb 2020
Accepted: 16 Jun 2020
Journal Details
License
Format
Journal
First Published
16 Apr 2015
Publication timeframe
4 times per year
Languages
English

We present a novel method for publishing differentially private synthetic attributed graphs. Our method allows, for the first time, to publish synthetic graphs simultaneously preserving structural properties, user attributes and the community structure of the original graph. Our proposal relies on CAGM, a new community-preserving generative model for attributed graphs. We equip CAGM with efficient methods for attributed graph sampling and parameter estimation. For the latter, we introduce differentially private computation methods, which allow us to release communitypreserving synthetic attributed social graphs with a strong formal privacy guarantee. Through comprehensive experiments, we show that our new model outperforms its most relevant counterparts in synthesising differentially private attributed social graphs that preserve the community structure of the original graph, as well as degree sequences and clustering coefficients.

Keywords

[1] Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing directive 95/46/ec (general data protection regulation). OJ, L 119:1–88, 4.5.2016.Search in Google Scholar

[2] Réka Albert and Albert-László Barabási. Statistical mechanics of complex networks. Review of Modern Physics, 74:47–97, 2002.Search in Google Scholar

[3] Jeremiah Blocki, Avrim Blum, Anupam Datta, and Or Sheffet. Differentially private data analysis of social networks via restricted sensitivity. In Proc. 4th Innovations in Theoretical Computer Science (ITCS), pages 87–96. ACM Press, 2013.Search in Google Scholar

[4] Vincent D Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre. Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment, 2008(10):P10008, 2008.Search in Google Scholar

[5] Jordi Casas-Roma, Jordi Herrera-Joancomartí, and Vicenç Torra. An algorithm for k-degree anonymity on large networks. In Procs. of the 2013 IEEE/ACM Int’l Conf. on Advances in Social Networks Analysis and Mining, pages 671–675, 2013.Search in Google Scholar

[6] Jordi Casas-Roma, Jordi Herrera-Joancomartí, and Vicenç Torra. Anonymizing graphs: measuring quality for clustering. Knowledge and Information Systems, 44(3):507–528, 2015.Search in Google Scholar

[7] Jordi Casas-Roma, Jordi Herrera-Joancomartí, and Vicenç Torra. k-degree anonymity and edge selection: improving data utility in large networks. Knowledge and Information Systems, 50(2):447–474, 2017.Search in Google Scholar

[8] Lauren E. Charles-Smith, Tera L. Reynolds, Mark A. Cameron, Mike Conway, Eric H. Y. Lau, Jennifer M. Olsen, Julie A. Pavlin, Mika Shigematsu, Laura C. Streichert, Katie J. Suda, and Courtney D. Corley. Using social media for actionable disease surveillance and outbreak management: A systematic literature review. PLOS ONE, 10(10):1–20, 10 2015.Search in Google Scholar

[9] James Cheng, Ada Wai-chee Fu, and Jia Liu. Kisomorphism: privacy preserving network publication against structural attacks. In Procs. of the 2010 ACM SIGMOD Int’l Conf. on Management of Data, pages 459–470, 2010.Search in Google Scholar

[10] Sean Chester, Bruce M Kapron, Ganesh Ramesh, Gautam Srivastava, Alex Thomo, and S Venkatesh. Why waldo befriended the dummy? k-anonymization of social networks with pseudo-nodes. Social Network Analysis and Mining, 3(3):381–399, 2013.Search in Google Scholar

[11] Fan Chung and Linyuan Lu. The average distances in random graphs with given expected degrees. Proceedings of the National Academy of Sciences, 99(25):15879–15882, 2002.Search in Google Scholar

[12] Aaron Clauset, Cristopher Moore, and M. E. J. Newman. Hierarchical structure and the prediction of missing links in networks. Nature, 453:98–101, 2008.Search in Google Scholar

[13] Cynthia Dwork. Differential privacy. In Proc. 33rd International Colloquium on Automata, Languages and Programming (ICALP), volume 4052 of Lecture Notes in Computer Science, pages 1–12. Springer, 2006.Search in Google Scholar

[14] Luoyi Fu, Xinzhe Fu, Zhongzhao Hu, Zhiying Xu, and Xinbing Wang. De-anonymization of social networks with communities: When quantifications meet algorithms. arXiv preprint arXiv:1703.09028, 2017.Search in Google Scholar

[15] Michael Hay, Chao Li, Gerome Miklau, and David D. Jensen. Accurate estimation of the degree distribution of private networks. In Proc. 19th IEEE International Conference on Data Mining (ICDM), pages 169–178. IEEE Computer Society, 2009.Search in Google Scholar

[16] Michael Hay, Gerome Miklau, David D. Jensen, Donald F. Towsley, and Philipp Weis. Resisting structural re-identification in anonymized social networks. PVLDB, 1(1):102–114, 2008.Search in Google Scholar

[17] Joseph J. Pfeiffer III, Timothy La Fond, Sebastián Moreno, and Jennifer Neville. Fast generation of large scale social networks while incorporating transitive closures. In Proc. 4th International Conference on Privacy, Security, Risk and Trust, (PASSAT), pages 154–165. IEEE Computer Society, 2012.Search in Google Scholar

[18] Joseph J. Pfeiffer III, Sebastián Moreno, Timothy La Fond, Jennifer Neville, and Brian Gallagher. Attributed graph models: modeling network structure with correlated attributes. In Proc. 23rd International World Wide Web Conference (WWW), pages 831–842. ACM Press, 2014.Search in Google Scholar

[19] Shouling Ji, Weiqing Li, Prateek Mittal, Xin Hu, and Raheem Beyah. Secgraph: A uniform and open-source evaluation system for graph data anonymization and deanonymization. In 24th {USENIX} Security Symposium ({USENIX} Security 15), pages 303–318, 2015.Search in Google Scholar

[20] Zach Jorgensen, Ting Yu, and Graham Cormode. Publishing attributed social graphs with formal privacy guarantees. In Proc. 2016 International Conference on Management of Data (SIGMOD), pages 107–122. ACM Press, 2016.Search in Google Scholar

[21] Kundan Kandhway and Joy Kuri. Using node centrality and optimal control to maximize information diffusion in social networks. IEEE Trans. Systems, Man, and Cybernetics: Systems, 47(7):1099–1110, 2017.Search in Google Scholar

[22] Brian Karrer and Mark EJNewman. Stochastic blockmodels and community structure in networks. Physical review. E, 83(1):016107, 2011.Search in Google Scholar

[23] Vishesh Karwa, Sofya Raskhodnikova, Adam D. Smith, and Grigory Yaroslavtsev. Private analysis of graph structure. ACM Transactions on Database Systems, 39(3):22:1–22:33, 2014.Search in Google Scholar

[24] Vishesh Karwa and Aleksandra B. Slavkovic. Differentially private graphical degree sequences and synthetic graphs. In Proc. 2012 International Conference on Privacy in Statistical Databases (PSD), volume 7556 of Lecture Notes in Computer Science, pages 273–285. Springer, 2012.Search in Google Scholar

[25] Daniel Kifer and Bing-Rong Lin. Towards an axiomatization of statistical privacy and utility. In Proc. 29th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS), pages 147–158. ACM Press, 2010.Search in Google Scholar

[26] Tamara G. Kolda, Ali Pinar, Todd D. Plantenga, and C. Seshadhri. A scalable generative graph model with community structure. SIAM J. Scientific Computing, 36(5), 2014.Search in Google Scholar

[27] Jérôme Kunegis. KONECT: the koblenz network collection. In Proc. 22nd International World Wide Web Conference (WWW), pages 1343–1350. ACM Press, 2013.Search in Google Scholar

[28] Christine Largeron, Pierre-Nicolas Mougel, Oualid Benyahia, and Osmar R Zaïane. Dancer: dynamic attributed networks with community structure generation. Knowledge and Information Systems, 53(1):109–151, 2017.Search in Google Scholar

[29] Christine Largeron, Pierre-Nicolas Mougel, Reihaneh Rabbany, and Osmar R Zaïane. Generating attributed networks with communities. PloS one, 10(4), 2015.Search in Google Scholar

[30] Jure Leskovec and Christos Faloutsos. Scalable modeling of real graphs using kronecker multiplication. In Proc. 24th International Conference on Machine Learning (ICML), pages 497–504. ACM Press, 2007.Search in Google Scholar

[31] Jure Leskovec and Andrej Krevl. SNAP Datasets: Stanford large network dataset collection. http://snap.stanford.edu/data, 2014.Search in Google Scholar

[32] Changchang Liu, Supriyo Chakraborty, and Prateek Mittal. Dependence makes you vulnberable: Differential privacy under dependent tuples. In Procs. of NDSS 2016, volume 16, pages 21–24, 2016.Search in Google Scholar

[33] Kun Liu and Evimaria Terzi. Towards identity anonymization on graphs. In Proc. 2008 ACM SIGMOD International Conference on Management of Data (SIGMOD), pages 93–106. ACM Press, 2008.Search in Google Scholar

[34] Xuesong Lu, Yi Song, and Stéphane Bressan. Fast identity anonymization on graphs. In Procs. of the Int’l Conf. on Database and Expert Systems Applications, pages 281–295, 2012.Search in Google Scholar

[35] Tinghuai Ma, Yuliang Zhang, Jie Cao, Jian Shen, Meili Tang, Yuan Tian, Abdullah Al-Dhelaan, and Mznah Al-Rodhaan. Kdvem: a k-degree anonymity with vertex and edge modification algorithm. Computing, 97(12):1165–1184, 2015.Search in Google Scholar

[36] Nelly Marquetoux, Mark A. Stevenson, Peter Wilson, Anne Ridler, and Cord Heuer. Using social network analysis to inform disease control interventions. Preventive Veterinary Medicine, 126:94–104, 2016.Search in Google Scholar

[37] Paolo Massa and Paolo Avesani. Trust-aware recommender systems. In Proc. 2007 ACM Conference on Recommender Systems (RecSys), pages 17–24. ACM Press, 2007.Search in Google Scholar

[38] Sjouke Mauw, Yunior Ramírez-Cruz, and Rolando Trujillo-Rasua. Anonymising social graphs in the presence of active attackers. Transactions on Data Privacy, 11(2):169–198, 2018.Search in Google Scholar

[39] Sjouke Mauw, Yunior Ramírez-Cruz, and Rolando Trujillo-Rasua. Conditional adjacency anonymity in social graphs under active attacks. Knowledge and Information Systems, 61(1):485–511, 2018.Search in Google Scholar

[40] Frank McSherry. Privacy integrated queries: an extensible platform for privacy-preserving data analysis. Communications of the ACM, 53(9):89–97, 2010.Search in Google Scholar

[41] Frank McSherry and Kunal Talwar. Mechanism design via differential privacy. In Proc. 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS), pages 94–103. IEEE Computer Society, 2007.Search in Google Scholar

[42] Darakhshan J. Mir and Rebecca N. Wright. A differentially private graph estimator. In Proc. 2009 ICDM International Workshop on Privacy Aspects of Data Mining (ICDM), pages 122–129. IEEE Computer Society, 2009.Search in Google Scholar

[43] Prateek Mittal, Charalampos Papamanthou, and Dawn Xiaodong Song. Preserving link privacy in social network based systems. In Procs. of NDSS 2013. The Internet Society, 2013.Search in Google Scholar

[44] M. E. J. Newman and M. Girvan. Finding and evaluating community structure in networks. Physical Reviewe E, 69(2):026113, 2004.Search in Google Scholar

[45] Mark E. J. Newman. Community detection in networks: Modularity optimization and maximum likelihood are equivalent. CoRR, abs/1606.02319, 2016.Search in Google Scholar

[46] Hiep H. Nguyen, Abdessamad Imine, and Michaël Rusinowitch. Detecting communities under differential privacy. In Proc. 2016 ACM Workshop on Privacy in the Electronic Society (WPES), pages 83–93. ACM Press, 2016.Search in Google Scholar

[47] Shirin Nilizadeh, Apu Kapadia, and Yong-Yeol Ahn. Community-enhanced de-anonymization of online social networks. In Proceedings of the 2014 acm sigsac conference on computer and communications security, pages 537–548, 2014.Search in Google Scholar

[48] Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. Smooth sensitivity and sampling in private data analysis. In Proc. 39th Annual ACM Symposium on Theory of Computing (STOC), pages 75–84. ACM Press, 2007.Search in Google Scholar

[49] Liudmila Ostroumova Prokhorenkova and Alexey Tikhonov. Community detection through likelihood optimization: In search of a sound model. In Proc. 30th World Wide Web Conference (WWW), pages 1498–1508. ACM Press, 2019.Search in Google Scholar

[50] François Rousseau, Jordi Casas-Roma, and Michalis Vazirgiannis. Community-preserving anonymization of graphs. Knowledge and Information Systems, 54(2):315–343, 2017.Search in Google Scholar

[51] Alessandra Sala, Xiaohan Zhao, Christo Wilson, Haitao Zheng, and Ben Y. Zhao. Sharing graphs using differentially private graph models. In Proc. 11th ACM SIGCOMM Internet Measurement Conference (IMC), pages 81–98. ACM Press, 2011.Search in Google Scholar

[52] Julián Salas and Vicenç Torra. Graphic sequences, distances and k-degree anonymity. Discrete Applied Mathematics, 188:25–31, 2015.Search in Google Scholar

[53] Yazhe Wang, Long Xie, Baihua Zheng, and Ken CK Lee. High utility k-anonymization for social network publishing. Knowledge and Information Systems, 41(3):697–725, 2014.Search in Google Scholar

[54] Yue Wang and Xintao Wu. Preserving differential privacy in degree-correlation based graph generation. Transaction on Data Privacy, 6(2):127–145, 2013.Search in Google Scholar

[55] Yue Wang, Xintao Wu, Jun Zhu, and Yang Xiang. On learning cluster coefficient of private networks. Social Network Analysis and Mining, 3(4):925–938, 2013.Search in Google Scholar

[56] Paul W.Holland, Kathryn Blackmond Laskey, and Samuel Leinhardt. Stochastic blockmodels: First steps. Social Networks, 5(2):109–137, 1983.Search in Google Scholar

[57] Gilbert Wondracek, Thorsten Holz, Engin Kirda, and Christopher Kruegel. A practical attack to de-anonymize social network users. In 2010 IEEE Symposium on Security and Privacy, pages 223–238. IEEE, 2010.Search in Google Scholar

[58] Wentao Wu, Yanghua Xiao, Wei Wang, Zhenying He, and Zhihui Wang. K-symmetry model for identity anonymization in social networks. In Procs. of the 13th Int’l Conf. on Extending Database Technology, pages 111–122, 2010.Search in Google Scholar

[59] Qian Xiao, Rui Chen, and Kian-Lee Tan. Differentially private network data release via structural inference. In Proc. 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pages 911–920. ACM Press, 2014.Search in Google Scholar

[60] Jaewon Yang and Jure Leskovec. Overlapping community detection at scale: a nonnegative matrix factorization approach. In Proc. 6th ACM International Conference on Web Search and Data Mining (WSDM), pages 587–596. ACM Press, 2013.Search in Google Scholar

[61] Jaewon Yang and Jure Leskovec. Defining and evaluating network communities based on ground-truth. Knowledge and Information Systems, 42(1):181–213, 2015.Search in Google Scholar

[62] Jaewon Yang, Julian J. McAuley, and Jure Leskovec. Community detection in networks with node attributes. In Proc. 13th IEEE International Conference on Data Mining (ICDM), pages 1151–1156. IEEE Computer Society, 2013.Search in Google Scholar

[63] Jun Zhang, Graham Cormode, Cecilia M. Procopiuc, Divesh Srivastava, and Xiaokui Xiao. Private release of graph statistics using ladder functions. In Proc. 36th ACM International Conference on Management of Data (SIGMOD), pages 731–745. ACM Press, 2015.Search in Google Scholar

[64] Yang Zhang, Mathias Humbert, Bartlomiej Surma, Praveen Manoharan, Jilles Vreeken, and Michael Backes. Towards plausible graph anonymization. In Procs. of NDSS 2020, 2020.Search in Google Scholar

[65] Bin Zhou and Jian Pei. The k-anonymity and l-diversity approaches for privacy preservation in social networks against neighborhood attacks. Knowledge Information Systems, 28(1):47–77, 2011.Search in Google Scholar

[66] Dmitry Zinoviev. Information Diffusion in Social Networks, pages 146–163. 11 2011.Search in Google Scholar

[67] Lei Zou, Lei Chen, and M. Tamer Özsu. K-automorphism: A general framework for privacy preserving network publication. PVLDB, 2(1):946–957, 2009.Search in Google Scholar

Recommended articles from Trend MD

Plan your remote conference with Sciendo