Volume 10 (2020): Issue 1 (January 2020)
Journal Details
First Published: 30 Dec 2014
Publication timeframe: 4 times per year
Access type: Open Access

On Training Deep Neural Networks Using a Streaming Approach

Published Online: 11 Dec 2019
Page range: 15 - 26
Received: 12 Sep 2019
Accepted: 18 Nov 2019

In recent years, deep learning methods have significantly improved systems based on artificial intelligence. Their effectiveness results from the ability to analyze large labeled datasets. The price of this high accuracy is the long training time needed to process such large amounts of data. At the same time, as the volume of collected data has grown, the field of data stream analysis has developed. It makes it possible to process data immediately, with no need to store them. In this work, we take advantage of data streaming to accelerate the training of deep neural networks. The work analyzes two approaches to network learning, presented against the background of traditional stochastic and batch-based methods.


