Volume 31 (2021): Issue 3 (September 2021)
Journal Details
License
Format: Journal
First Published: 05 Apr 2007
Publication timeframe: 4 times per year
Languages: English
Access type: Open Access

Fitting a Gaussian Mixture Model Through the Gini Index

Published Online: 27 Sep 2021
Page range: 487 - 500
Received: 29 Mar 2021
Accepted: 01 Jul 2021
Abstract

A Gaussian mixture model is a linear combination of Gaussian components and is widely used in data mining and pattern recognition. In this paper, we propose a method for estimating the parameters of the density function given by a Gaussian mixture model. The proposal is based on the Gini index, a measure of the degree of inequality between two probability distributions, and consists of minimizing the Gini index between an empirical distribution of the data and a Gaussian mixture model. We present several examples with simulated and real data and examine some properties of the proposed method.
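The general idea of fitting a mixture by minimizing a discrepancy between the empirical distribution and the model can be illustrated with a minimal sketch. The abstract does not give the formula for the Gini index, so the code below substitutes a stand-in objective: the area between the empirical CDF and the mixture CDF, which in one dimension equals the Kantorovich (Wasserstein-1) distance. The function names (`gmm_cdf`, `cdf_discrepancy`, `fit_two_means`), the synthetic data, and the brute-force search over two means are all assumptions made for illustration and are not the authors' algorithm.

```python
import math
import random


def gmm_cdf(x, weights, means, sigmas):
    """CDF of a 1-D Gaussian mixture evaluated at x."""
    return sum(
        w * 0.5 * (1.0 + math.erf((x - m) / (s * math.sqrt(2.0))))
        for w, m, s in zip(weights, means, sigmas)
    )


def cdf_discrepancy(data, weights, means, sigmas, grid_size=400):
    """Approximate area between the empirical CDF and the mixture CDF.

    Stand-in objective (an assumption): in 1-D this area equals the
    Kantorovich (Wasserstein-1) distance. It is not claimed to be the
    Gini index used in the paper.
    """
    xs = sorted(data)
    n = len(xs)
    lo, hi = xs[0] - 3.0, xs[-1] + 3.0  # pad the range to cover the tails
    step = (hi - lo) / grid_size
    total = 0.0
    idx = 0  # number of data points <= current grid point
    for k in range(grid_size):
        t = lo + (k + 0.5) * step  # midpoint rule
        while idx < n and xs[idx] <= t:
            idx += 1
        total += abs(idx / n - gmm_cdf(t, weights, means, sigmas)) * step
    return total


def fit_two_means(data, candidates, weights=(0.5, 0.5), sigmas=(1.0, 1.0)):
    """Brute-force search over ordered pairs of means.

    Weights and standard deviations are held fixed to keep the sketch short.
    """
    best = None
    for m1 in candidates:
        for m2 in candidates:
            if m1 >= m2:
                continue
            score = cdf_discrepancy(data, weights, (m1, m2), sigmas)
            if best is None or score < best[0]:
                best = (score, (m1, m2))
    return best


# Synthetic two-component sample (hypothetical data, fixed seed).
rng = random.Random(0)
data = [rng.gauss(-2.0, 1.0) for _ in range(200)]
data += [rng.gauss(3.0, 1.0) for _ in range(200)]

candidates = [c * 0.5 for c in range(-10, 11)]  # -5.0 to 5.0 in steps of 0.5
score, means = fit_two_means(data, candidates)
print("estimated means:", means, "discrepancy:", round(score, 3))
```

A practical estimator would also optimize the weights and variances and would use a numerical optimizer rather than a grid; the grid search is used here only to keep the example dependency-free.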


