Volume 2019 (2019): Issue 2 (April 2019)
Journal Details
License
Format: Journal
eISSN: 2299-0984
First Published: 16 Apr 2015
Publication timeframe: 4 times per year
Languages: English
Access type: Open Access

Together or Alone: The Price of Privacy in Collaborative Learning

Published Online: 04 May 2019
Page range: 47 - 65
Received: 31 Aug 2018
Accepted: 16 Dec 2018
Abstract

Machine learning algorithms have reached mainstream status and are widely deployed in many applications. The accuracy of such algorithms depends significantly on the size of the underlying training dataset; in reality, a small or medium-sized organization often lacks the data needed to train a reasonably accurate model. For such organizations, a realistic solution is to train machine learning models on a joint dataset (the union of the individual ones). Unfortunately, privacy concerns prevent them from straightforwardly doing so. While a number of privacy-preserving solutions exist for collaborating organizations to securely aggregate parameters while training the models, we are not aware of any work that provides a rational framework for the participants to precisely balance the privacy loss and accuracy gain of their collaboration.

In this paper, focusing on a two-player setting, we model the collaborative training process as a two-player game in which each player aims to achieve higher accuracy while preserving the privacy of its own dataset. We introduce the notion of Price of Privacy, a novel measure of the impact of privacy protection on accuracy within the proposed framework. Furthermore, we develop a game-theoretical model for different player types, and then either find or prove the existence of a Nash Equilibrium with regard to the strength of privacy protection for each player. Using recommendation systems as our main use case, we demonstrate how two players can make practical use of the proposed theoretical framework, including setting up the parameters and approximating the non-trivial Nash Equilibrium.
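The equilibrium analysis can be sketched with a toy two-player game (a stand-in, not the paper's actual utility functions): each player picks a privacy level on a grid, the accuracy both players enjoy from the collaboration decays with either player's protection, and each player additionally values protecting its own data. The collaboration value G, the privacy benefits B, and the utility form are all invented for illustration; Nash equilibria are found by exhaustively checking unilateral deviations.

```python
import itertools
import numpy as np

levels = np.linspace(0.0, 1.0, 11)  # privacy levels 0.0, 0.1, ..., 1.0
G = 1.0            # value of a full-accuracy collaboration (assumed)
B = (0.3, 0.8)     # per-player benefit of privacy protection (assumed)

def utility(i, p):
    # Shared accuracy shrinks with both players' protection levels,
    # while player i additionally gains B[i] per unit of own protection.
    acc = G * (1 - p[0]) * (1 - p[1])
    return acc + B[i] * p[i]

def is_nash(p):
    # p is a Nash equilibrium if neither player gains by deviating alone.
    for i in (0, 1):
        for dev in levels:
            q = list(p)
            q[i] = dev
            if utility(i, q) > utility(i, p) + 1e-12:
                return False
    return True

equilibria = [p for p in itertools.product(levels, levels) if is_nash(p)]
print(equilibria)
```

With the constants above, best-response reasoning yields equilibria at both extremes — neither player protecting, and both fully protecting — and which profile is reached depends on how much each player values privacy relative to accuracy, the kind of trade-off the proposed framework formalizes.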

