Volume 31 (2021): Issue 4 (December 2021)
    Advanced Machine Learning Techniques in Data Analysis (special section, pp. 549-611), Maciej Kusy, Rafał Scherer, and Adam Krzyżak (Eds.)
Journal Details
Format: Journal
eISSN: 2083-8492
First Published: 05 Apr 2007
Publication timeframe: 4 times per year
Languages: English
Access type: Open Access

A weighted wrapper approach to feature selection

Published Online: 30 Dec 2021
Page range: 685 - 696
Received: 24 Feb 2021
Accepted: 04 Oct 2021
Abstract

This paper treats feature selection as the problem of aggregating three state-of-the-art filter methods: Pearson’s linear correlation coefficient, the ReliefF algorithm and decision trees. A new wrapper method is proposed which, by fusing these approaches and using the performance of a classifier, creates a distinct, ordered subset of attributes that is optimal with respect to the highest classification accuracy obtainable by a convolutional neural network. The proposed feature selection uses a weighted ranking criterion. To evaluate its effectiveness, the method is compared with sequential feature selection methods, which are widely known and widely used wrapper approaches. Additionally, to emphasize the need for dimensionality reduction, the results obtained on all attributes are shown. The outcomes are verified on classification tasks using repository data sets characterized by high dimensionality. The conclusions confirm that it is worth seeking new solutions that provide a better classification result while reducing the number of input features.
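
The general idea of a weighted ranking criterion combined with a wrapper step can be illustrated with a minimal sketch. The abstract does not state the exact criterion, so the code below makes several assumptions: each filter method yields one relevance score per feature, scores are converted to rank positions, the ranks are fused by a weighted sum, and the wrapper keeps the best-scoring prefix of the fused ordering. All names (`ranks_from_scores`, `weighted_ranking`, `select_prefix`, `toy_accuracy`), the scores and the weights are hypothetical, and `toy_accuracy` is only a stand-in for the accuracy of a trained classifier such as the convolutional neural network used in the paper.

```python
from typing import Callable, List, Sequence, Tuple


def ranks_from_scores(scores: Sequence[float]) -> List[int]:
    """Return rank positions (1 = most relevant) for per-feature scores."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, feature in enumerate(order, start=1):
        ranks[feature] = position
    return ranks


def weighted_ranking(score_lists: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> List[int]:
    """Order feature indices by a weighted sum of ranks (lower is better).

    Assumption: the fusion is a weighted sum of rank positions; the paper's
    actual criterion may differ.
    """
    per_method = [ranks_from_scores(s) for s in score_lists]
    n_features = len(score_lists[0])
    combined = [sum(w * r[i] for w, r in zip(weights, per_method))
                for i in range(n_features)]
    return sorted(range(n_features), key=lambda i: combined[i])


def select_prefix(ordering: Sequence[int],
                  evaluate: Callable[[List[int]], float]) -> Tuple[List[int], float]:
    """Wrapper step: evaluate every top-k prefix and keep the most accurate."""
    best_subset: List[int] = []
    best_accuracy = float("-inf")
    for k in range(1, len(ordering) + 1):
        subset = list(ordering[:k])
        accuracy = evaluate(subset)
        if accuracy > best_accuracy:
            best_subset, best_accuracy = subset, accuracy
    return best_subset, best_accuracy


# Hypothetical relevance scores for four features from three filter methods.
pearson_scores = [0.9, 0.1, 0.5, 0.3]
relieff_scores = [0.2, 0.05, 0.8, 0.6]
tree_scores = [0.7, 0.0, 0.6, 0.1]
weights = (0.5, 0.25, 0.25)  # hypothetical weights, one per method


def toy_accuracy(subset: List[int]) -> float:
    """Stand-in for classifier accuracy: features 0 and 2 help, size costs."""
    informative = len(set(subset) & {0, 2})
    return 0.6 + 0.1 * informative - 0.02 * len(subset)


ordering = weighted_ranking(
    [pearson_scores, relieff_scores, tree_scores], weights)
best_subset, best_accuracy = select_prefix(ordering, toy_accuracy)
print(ordering, best_subset)
```

In a real experiment, `toy_accuracy` would be replaced by a routine that trains and validates the chosen classifier on the selected columns, and the weights would be tuned rather than fixed by hand.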
