Volume 2022 (2022): Issue 2 (April 2022)

Journal Details
Format: Journal
eISSN: 2299-0984
First Published: 16 Apr 2015
Publication timeframe: 4 times per year
Languages: English
Access type: Open Access

User-Level Label Leakage from Gradients in Federated Learning

Published Online: 03 Mar 2022
Volume & Issue: Volume 2022 (2022), Issue 2 (April 2022)
Page range: 227 - 244
Received: 31 Aug 2021
Accepted: 16 Dec 2021
Abstract

Federated learning enables multiple users to build a joint model by sharing their model updates (gradients), while their raw data remains local on their devices. Contrary to the common belief that this provides privacy benefits, we add to the very recent results on privacy risks of sharing gradients. Specifically, we investigate Label Leakage from Gradients (LLG), a novel attack to extract the labels of the users' training data from their shared gradients. The attack exploits the direction and magnitude of gradients to determine the presence or absence of any label. LLG is simple yet effective, capable of leaking potentially sensitive information represented by labels, and scales well to arbitrary batch sizes and multiple classes. We mathematically and empirically demonstrate the validity of the attack under different settings. Moreover, empirical results show that LLG successfully extracts labels with high accuracy at the early stages of model training. We also discuss different defense mechanisms against such leakage. Our findings suggest that gradient compression is a practical technique to mitigate the attack.
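The gradient signal the abstract refers to can be illustrated with a minimal NumPy sketch (all names illustrative, not from the paper's code). For a linear output layer trained with softmax cross-entropy, the gradient of the loss with respect to the bias is softmax(logits) minus the one-hot label vector; for a single example, the entry for the true label is the only negative one, so an observer of the shared gradient can recover the label by its sign. LLG builds on this kind of signal, additionally using gradient magnitudes to handle larger batches.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a single linear layer with softmax cross-entropy loss.
num_classes, dim = 10, 32
W = rng.normal(size=(num_classes, dim))
b = np.zeros(num_classes)

x = rng.normal(size=dim)
true_label = 7

# Forward pass: logits and numerically stable softmax.
logits = W @ x + b
p = np.exp(logits - logits.max())
p /= p.sum()

# Gradient of the cross-entropy loss w.r.t. the bias: p - one_hot(y).
one_hot = np.zeros(num_classes)
one_hot[true_label] = 1.0
bias_grad = p - one_hot

# The entry for the true label is the only negative one (p[y] - 1 < 0,
# all other entries are p[c] > 0), so the label is recovered as the argmin.
recovered = int(np.argmin(bias_grad))
print(recovered)  # 7
```

This single-example case is exact; with batches, the per-label gradient components are averaged, and LLG exploits the tendency of present labels to pull the corresponding gradient entries negative, together with their magnitudes, to infer label counts.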
