Open access

A Novel Drift Detection Algorithm Based on Features’ Importance Analysis in a Data Streams Environment


eISSN:
2083-2567
Language:
English
Publication frequency:
4 times per year
Journal subjects:
Computer Sciences, Databases and Data Mining, Artificial Intelligence