Volume 25 (2015): Issue 4 (December 2015)
Special issue: Complex Problems in High-Performance Computing Systems, Editors: Mauro Iacono, Joanna Kołodziej
Journal Details
Format: Journal
eISSN: 2083-8492
First Published: 05 Apr 2007
Publication timeframe: 4 times per year
Languages: English
Access type: Open Access

Optimization of the Maximum Likelihood Estimator for Determining the Intrinsic Dimensionality of High-Dimensional Data

Published Online: 30 Dec 2015
Page range: 895 - 913
Received: 11 Sep 2014
Abstract

One of the problems in analysing a set of images of a moving object is to evaluate the number of degrees of freedom of the motion and the angle of rotation. The intrinsic dimensionality of the multidimensional data characterizing the set of images can be used for this purpose. Usually, an image is represented by a high-dimensional point whose dimensionality depends on the number of pixels in the image. Knowing the intrinsic dimensionality of a data set is very useful in exploratory data analysis, because it indicates how far the dimensionality of the data can be reduced without losing much information. In this paper, the maximum likelihood estimator (MLE) of the intrinsic dimensionality is explored experimentally. In contrast to previous works, the radius of the hypersphere that covers the neighbours of the analysed points is fixed, instead of the number of nearest neighbours. A way of choosing the radius in this method is proposed. We explore which metric, Euclidean or geodesic, must be used in the MLE algorithm in order to get the true estimate of the intrinsic dimensionality. The MLE method is examined using a number of artificial and real (image) data sets.
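
To make the fixed-radius idea concrete, the following is a minimal Python sketch of a maximum likelihood estimate computed from all neighbours that fall inside a hypersphere of a given radius, rather than from a fixed number of nearest neighbours. It is an illustration under stated assumptions, not the procedure evaluated in the paper: the function name mle_intrinsic_dim_fixed_radius is hypothetical, only Euclidean distances are used, the radius is supplied by the caller (the abstract does not state the selection rule), and the per-point estimates are simply averaged.

```python
import numpy as np


def mle_intrinsic_dim_fixed_radius(points, radius):
    """Fixed-radius MLE sketch: average of per-point estimates.

    For a point x with neighbours at distances T_1..T_N inside the
    hypersphere of the given radius R, the per-point estimate is
        m(x) = 1 / mean_j( log(R / T_j) ).
    Assumption: Euclidean metric, plain average over points.
    """
    points = np.asarray(points, dtype=float)

    # Pairwise Euclidean distances (O(n^2) memory; fine for a small sketch).
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))

    estimates = []
    for row in dists:
        # Neighbours strictly inside the hypersphere; zero distances
        # (the point itself and exact duplicates) are excluded.
        neigh = row[(row > 0.0) & (row < radius)]
        if neigh.size == 0:
            continue  # no neighbours: this point contributes nothing
        mean_log = np.mean(np.log(radius / neigh))
        estimates.append(1.0 / mean_log)

    if not estimates:
        raise ValueError("radius too small: no point has a neighbour")
    return float(np.mean(estimates))
```

Calling the function on points sampled from a flat two-dimensional patch embedded in a higher-dimensional space should give a value near 2, provided the radius is large enough for each point to have a reasonable number of neighbours. A geodesic variant would replace the Euclidean distance matrix with shortest-path distances on a neighbourhood graph; that step is not shown here.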
