Sciendo RSS Feed for Journal of Artificial Intelligence and Soft Computing Research
Convolutional Network to Solving Connected Text Captcha<abstract> <title style='display:none'>Abstract</title> <p>Text-based CAPTCHA is a convenient and effective safety mechanism that has been widely deployed across websites. The efficient end-to-end models of scene text recognition consisting of CNN and attention-based RNN show limited performance in solving text-based CAPTCHAs. In contrast with the street view image and document, the character sequence in CAPTCHA is non-semantic. The RNN loses its ability to learn the semantic context and only implicitly encodes the relative position of extracted features. Meanwhile, the security features, which prevent characters from segmentation and recognition, extensively increase the complexity of CAPTCHAs. The performance of this model is sensitive to different CAPTCHA schemes. In this paper, we analyze the properties of the text-based CAPTCHA and accordingly consider solving it as a highly position-relative character sequence recognition task. We propose a network named PosConv to leverage the position information in the character sequence without RNN. PosConv uses a novel padding strategy and modified convolution, explicitly encoding the relative position into the local features of characters. This mechanism of PosConv makes the extracted features from CAPTCHAs more informative and robust. 
We validate PosConv on six text-based CAPTCHA schemes, and it achieves state-of-the-art or competitive recognition accuracy with significantly fewer parameters and faster convergence speed.</p> </abstract>ARTICLE2022-02-23T00:00:00.000+00:00A Progressive and Cross-Domain Deep Transfer Learning Framework for Wrist Fracture Detection<abstract> <title style='display:none'>Abstract</title> <p>There has been an amplified focus on and benefit from the adoption of artificial intelligence (AI) in medical imaging applications. However, deep learning approaches involve training with massive amounts of annotated data in order to guarantee generalization and achieve high accuracies. Gathering and annotating large sets of training images require expertise which is both expensive and time-consuming, especially in the medical field. Furthermore, in health care systems where mistakes can have catastrophic consequences, there is a general mistrust in the black-box aspect of AI models. In this work, we focus on improving the performance of medical imaging applications when limited data is available while focusing on the interpretability aspect of the proposed AI model. This is achieved by employing a novel transfer learning framework, <italic>progressive transfer learning</italic>, an automated annotation technique and a correlation analysis experiment on the learned representations.</p> <p><italic>Progressive transfer learning</italic> helps jump-start the training of deep neural networks while improving the performance by gradually transferring knowledge from two source tasks into the target task. It is empirically tested on the wrist fracture detection application by first training a general radiology network <italic>RadiNet</italic> and using its weights to initialize <italic>RadiNet<sub>wrist</sub></italic>, that is trained on wrist images to detect fractures. 
Experiments show that <italic>RadiNet<sub>wrist</sub></italic> achieves an accuracy of 87% and an AUC ROC of 94% as opposed to 83% and 92% when it is pre-trained on the ImageNet dataset.</p> <p>This improvement in performance is investigated within an <italic>explainable AI</italic> framework. More concretely, the learned deep representations of <italic>RadiNet<sub>wrist</sub></italic> are compared to those learned by the baseline model by conducting a correlation analysis experiment. The results show that, when transfer learning is <italic>gradually</italic> applied, some features are learned earlier in the network. Moreover, the deep layers in the <italic>progressive transfer learning</italic> framework are shown to encode features that are not encountered when traditional transfer learning techniques are applied.</p> <p>In addition to the empirical results, a clinical study is conducted and the performance of <italic>RadiNet<sub>wrist</sub></italic> is compared to that of an expert radiologist. We found that <italic>RadiNet<sub>wrist</sub></italic> exhibited similar performance to that of radiologists with more than 20 years of experience.</p> <p>This motivates follow-up research to train on more data to feasibly surpass radiologists’ performance, and investigate the interpretability of AI models in the healthcare domain where the decision-making process needs to be credible and transparent.</p> </abstract>ARTICLE2022-02-23T00:00:00.000+00:00Machine Learning and Traditional Econometric Models: A Systematic Mapping Study<abstract> <title style='display:none'>Abstract</title> <p><italic>Context</italic>: Machine Learning (ML) is a disruptive concept that has given rise to and generated interest in different applications in many fields of study. The purpose of Machine Learning is to solve real-life problems by automatically learning and improving from experience without being explicitly programmed for a specific problem, but for a generic type of problem. 
This article approaches the different applications of ML in a series of econometric methods.</p> <p><italic>Objective</italic>: The objective of this research is to identify the latest applications and do a comparative study of the performance of econometric and ML models. The study aimed to find empirical evidence for the performance of ML algorithms being superior to traditional econometric models. The methodology of systematic literature mapping has been followed to carry out this research, according to the guidelines established by [39] and [58], which facilitate the identification of studies published on this subject.</p> <p><italic>Results</italic>: The results show that in most cases ML outperforms econometric models, while in other cases the best performance has been achieved by combining traditional methods and ML applications.</p> <p><italic>Conclusion</italic>: Inclusion and exclusion criteria have been applied and 52 closely related articles have been reviewed. The conclusion drawn from this research is that the field is growing, as is well known, and that there is no certainty that ML always outperforms econometric models.</p> </abstract>ARTICLE2022-02-23T00:00:00.000+00:00An Autoencoder-Enhanced Stacking Neural Network Model for Increasing the Performance of Intrusion Detection<abstract> <title style='display:none'>Abstract</title> <p>Security threats, among other intrusions affecting the availability, confidentiality and integrity of IT resources and services, are spreading fast and can cause serious harm to organizations. Intrusion detection has a key role in capturing intrusions. In particular, the application of machine learning methods in this area can enrich the intrusion detection efficiency. Various methods, such as pattern recognition from event logs, can be applied in intrusion detection. 
The main goal of our research is to present a possible intrusion detection approach using recent machine learning techniques. In this paper, we suggest and evaluate the usage of stacked ensembles consisting of neural network (SNN) and autoencoder (AE) models augmented with a tree-structured Parzen estimator hyperparameter optimization approach for intrusion detection. The main contribution of our work is the application of advanced hyperparameter optimization and stacked ensembles together.</p> <p>We conducted several experiments to check the effectiveness of our approach. We used the NSL-KDD dataset, a common benchmark dataset in intrusion detection, to train our models. The comparative results demonstrate that our proposed models can compete with and, in some cases, outperform existing models.</p> </abstract>ARTICLE2022-02-23T00:00:00.000+00:00Handling Realistic Noise in Multi-Agent Systems with Self-Supervised Learning and Curiosity<abstract> <title style='display:none'>Abstract</title> <p><sup>1</sup>Most reinforcement learning benchmarks – especially in multi-agent tasks – do not go beyond observations with simple noise; nonetheless, real scenarios induce more elaborate vision pipeline failures: false sightings, misclassifications or occlusion. In this work, we propose a lightweight, 2D environment for robot soccer and autonomous driving that can emulate the above discrepancies. Besides establishing a benchmark for accessible multi-agent reinforcement learning research, our work addresses the challenges the simulator imposes. For handling realistic noise, we use self-supervised learning to enhance scene reconstruction and extend curiosity-driven learning to model longer horizons. 
Our extensive experiments show that the proposed methods achieve state-of-the-art performance, compared against actor-critic methods, ICM, and PPO.</p> </abstract>ARTICLE2022-02-23T00:00:00.000+00:00Energy Associated Tuning Method for Short-Term Series Forecasting by Complete and Incomplete Datasets<abstract><title style='display:none'>Abstract</title><p> This article presents short-term predictions using neural networks tuned by a predictor filter based on the energy associated with the series, for complete and incomplete datasets. A benchmark of high-roughness time series from Mackey-Glass (MG), Logistic (LOG) and Henon (HEN), together with some univariate series chosen from the NN3 Forecasting Competition, is used. An average smoothing technique is used to fill in the data missing from the dataset. The Hurst parameter estimated through wavelets is used to estimate the roughness of the real and forecasted series. The validation and horizon of the time series are given by the 15 values ahead. The performance of the proposed filter shows that, even when a short dataset is incomplete and only a linear smoothing technique is employed, the prediction is almost fair as measured by the SMAPE index. Although the main result shows that the predictor system based on the energy associated with the series has an optimal performance on several chaotic time series, this method in particular, among others, provides a good estimate when the short-term series are taken from one-point observations.</p></abstract>ARTICLE2016-12-17T00:00:00.000+00:00Kernel Analysis for Estimating the Connectivity of a Network with Event Sequences<abstract><title style='display:none'>Abstract</title><p> Estimating the connectivity of a network from events observed at each node has many applications. One prominent example is found in neuroscience, where spike trains (sequences of action potentials) are observed at each neuron, but the way in which these neurons are connected is unknown. 
This paper introduces a novel method for estimating connections between nodes using a similarity measure between sequences of event times. Specifically, a normalized positive definite kernel defined on spike trains was used. The proposed method was evaluated using synthetic and real data, by comparing with methods using transfer entropy and the Victor-Purpura distance. Synthetic data was generated using CERM (Coupled Escape-Rate Model), a model that generates various spike trains. Real data recorded from the visual cortex of an anaesthetized cat was analyzed as well. The results showed that the proposed method provides an effective way of estimating the connectivity of a network when the time sequences of events are the only available information.</p></abstract>ARTICLE2016-12-17T00:00:00.000+00:00Can Learning Vector Quantization be an Alternative to SVM and Deep Learning? - Recent Trends and Advanced Variants of Learning Vector Quantization for Classification Learning<abstract><title style='display:none'>Abstract</title><p> Learning vector quantization (LVQ) is one of the most powerful approaches for prototype based classification of vector data, intuitively introduced by Kohonen. The prototype adaptation scheme relies on its attraction and repulsion during the learning providing an easy geometric interpretability of the learning as well as of the classification decision scheme. 
Although deep learning architectures and support vector classifiers frequently achieve comparable or even better results, LVQ models are smart alternatives with low complexity and computational costs, making them attractive for many industrial applications like intelligent sensor systems or advanced driver assistance systems.</p><p>Nowadays, the mathematical theory developed for LVQ delivers sufficient justification of the algorithm, making it an appealing alternative to other approaches like support vector machines and deep learning techniques.</p><p>This review article reports current developments and extensions of LVQ starting from the generalized LVQ (GLVQ), which is known as the most powerful cost function based realization of the original LVQ. The cost function minimized in GLVQ is a soft approximation of the standard classification error, allowing gradient descent learning techniques. The GLVQ variants considered in this contribution cover many aspects like border-sensitive learning, application of non-Euclidean metrics like kernel distances or divergences, relevance learning as well as optimization of advanced statistical classification quality measures beyond the accuracy, including sensitivity and specificity or area under the ROC-curve.</p><p>According to these topics, the paper highlights the basic motivation for these variants and extensions together with the mathematical prerequisites and treatments for integration into the standard GLVQ scheme and compares them to other machine learning approaches. 
For a detailed description and the mathematical theory behind each, the reader is referred to the respective original articles.</p><p>Thus, the intention of the paper is to provide a comprehensive overview of the state-of-the-art, serving as a starting point to search for an appropriate LVQ variant in case of a given specific classification problem as well as a reference to recently developed variants and improvements of the basic GLVQ scheme.</p></abstract>ARTICLE2016-12-17T00:00:00.000+00:00A New Mechanism for Data Visualization with Tsk-Type Preprocessed Collaborative Fuzzy Rule Based System<abstract><title style='display:none'>Abstract</title><p> A novel data knowledge representation with the combination of structure learning ability of preprocessed collaborative fuzzy clustering and fuzzy expert knowledge of Takagi-Sugeno-Kang type model is presented in this paper. The proposed method divides a huge dataset into two or more subsets. The subsets interact with each other through a collaborative mechanism in order to find some similar properties within each other. The proposed method is useful in dealing with big data issues since it divides a huge dataset into subsets and finds common features among the subsets. The salient feature of the proposed method is that it uses a small subset of the dataset and some common features instead of using the entire dataset and all the features. Before interactions among subsets of the dataset, the proposed method applies a mapping technique for granules of data and centroids of clusters. The proposed method uses information from only about half (somewhat fewer or more) of the data patterns for the training process, and it provides an accurate and robust model, whereas the other existing methods use the entire information of the data patterns. 
Simulation results show the proposed method performs better than existing methods on some benchmark problems.</p></abstract>ARTICLE2016-12-17T00:00:00.000+00:00A Survey of Artificial Intelligence Techniques Employed for Adaptive Educational Systems within E-Learning Platforms<abstract><title style='display:none'>Abstract</title><p>The adaptive educational systems within e-learning platforms are built in response to the fact that the learning process is different for each and every learner. In order to provide adaptive e-learning services and study materials that are tailor-made for adaptive learning, this type of educational approach seeks to combine the ability to comprehend and detect a person’s specific needs in the context of learning with the expertise required to use appropriate learning pedagogy and enhance the learning process. Thus, it is critical to create accurate student profiles and models based upon analysis of their affective states, knowledge level, and their individual personality traits and skills. The acquired data can then be efficiently used and exploited to develop an adaptive learning environment. Once acquired, these learner models can be used in two ways. The first is to inform the pedagogy proposed by the experts and designers of the adaptive educational system. The second is to give the system dynamic self-learning capabilities from the behaviors exhibited by the teachers and students to create the appropriate pedagogy and automatically adjust the e-learning environments to suit the pedagogies. In this respect, artificial intelligence techniques may be useful for several reasons, including their ability to develop and imitate human reasoning and decision-making processes (learning-teaching model) and minimize the sources of uncertainty to achieve an effective learning-teaching context. These learning capabilities ensure both learner and system improvement over the lifelong learning mechanism. 
In this paper, we present a survey of topics raised by and related to artificial intelligence techniques employed for adaptive educational systems within e-learning, their advantages and disadvantages, and a discussion of the importance of using those techniques to achieve more intelligent and adaptive e-learning environments.</p></abstract>ARTICLE2016-12-17T00:00:00.000+00:00Performance Analysis of Rough Set–Based Hybrid Classification Systems in the Case of Missing Values<abstract> <title style='display:none'>Abstract</title> <p>The paper presents a performance analysis of a few selected rough set–based classification systems. They are hybrid solutions designed to process information with missing values. Rough set–based classification systems combine various classification methods, such as support vector machines, k–nearest neighbour, fuzzy systems, and neural networks with the rough set theory. When all input values take the form of real numbers, and they are available, the structure of the classifier returns to a non–rough set version. The performance of the four systems has been analysed based on the classification results obtained for benchmark databases downloaded from the machine learning repository of the University of California at Irvine.</p> </abstract>ARTICLE2021-10-08T00:00:00.000+00:00A Novel Fast Feedforward Neural Networks Training Algorithm<abstract> <title style='display:none'>Abstract</title> <p>In this paper<sup>1</sup> a new neural networks training algorithm is presented. The algorithm originates from the Recursive Least Squares (RLS) method commonly used in adaptive filtering. It uses the QR decomposition in conjunction with the Givens rotations for solving a normal equation resulting from minimization of the loss function. An important parameter in neural networks is training time. 
Many commonly used algorithms require a large number of iterations in order to achieve a satisfactory outcome, while other algorithms are effective only for small neural networks. The proposed solution is characterized by a very short convergence time compared to the well-known backpropagation method and its variants. The paper contains a complete mathematical derivation of the proposed algorithm. Extensive simulation results are presented using various benchmarks including function approximation, classification, encoder, and parity problems. The obtained results show the advantages of the featured algorithm, which outperforms commonly used recent state-of-the-art neural network training algorithms, including the Adam optimizer and Nesterov’s accelerated gradient.</p> </abstract>ARTICLE2021-10-08T00:00:00.000+00:00Decision Making Support System for Managing Advertisers By Ad Fraud Detection<abstract> <title style='display:none'>Abstract</title> <p>Efficient lead management can substantially enhance online channel marketing programs. In the paper, we classify website traffic as human-origin or bot-origin. We use feedforward neural networks with embedding layers. Moreover, we use one-hot encoding for categorical data. The mouse click data come from seven large retail stores and the lead classification data from three financial institutions. The data are collected by JavaScript code embedded into HTML pages. The three proposed models achieved relatively high accuracy in detecting artificially generated traffic.</p> </abstract>ARTICLE2021-10-08T00:00:00.000+00:00A Novel Grid-Based Clustering Algorithm<abstract> <title style='display:none'>Abstract</title> <p>Data clustering is an important method used to discover naturally occurring structures in datasets. One of the most popular approaches is the grid-based concept of clustering algorithms. 
This kind of method is characterized by a fast processing time and it can also discover clusters of arbitrary shapes in datasets. These properties allow these methods to be used in many different applications. Researchers have created many versions of the clustering method using the grid-based approach. However, the key issue is the right choice of the number of grid cells. This paper proposes a novel grid-based algorithm which uses a method for automatically determining the number of grid cells. This method is based on the <italic>k<sub>dist</sub></italic> function, which computes the distance between each element of a dataset and its <italic>k</italic>th nearest neighbor. Experimental results have been obtained for several different datasets and they confirm a very good performance of the newly proposed method.</p> </abstract>ARTICLE2021-10-08T00:00:00.000+00:00A New Statistical Reconstruction Method for the Computed Tomography Using an X-Ray Tube with Flying Focal Spot<abstract> <title style='display:none'>Abstract</title> <p>This paper presents a new image reconstruction method for spiral cone-beam tomography scanners in which an X-ray tube with a flying focal spot is used. The method is based on principles related to the statistical model-based iterative reconstruction (MBIR) methodology. The proposed approach is a continuous-to-continuous data model approach, and the forward model is formulated as a shift-invariant system. This allows for avoiding a nutating reconstruction-based approach, e.g. the advanced single slice rebinning methodology (ASSR) that is usually applied in computed tomography (CT) scanners with X-ray tubes with a flying focal spot. In turn, the proposed approach allows for significantly accelerating the reconstruction processing and, generally, for greatly simplifying the entire reconstruction procedure. 
Additionally, it improves the quality of the reconstructed images in comparison to the traditional algorithms, as confirmed by extensive simulations. It is worth noting that the main purpose of introducing statistical reconstruction methods to medical CT scanners is the reduction of the impact of measurement noise on the quality of tomography images and, consequently, the dose reduction of X-ray radiation absorbed by a patient. A series of computer simulations followed by doctor’s assessments have been performed, which indicate how great a reduction of the absorbed dose can be achieved using the reconstruction approach presented here.</p> </abstract>ARTICLE2021-10-08T00:00:00.000+00:00Performance Analysis of Data Fusion Methods Applied to Epileptic Seizure Recognition<abstract> <title style='display:none'>Abstract</title> <p>Epilepsy is a chronic neurological disorder that is characterized by unprovoked recurrent seizures. The most commonly used tool for the diagnosis of epilepsy is the electroencephalogram (EEG), whereby the electrical activity of the brain is measured. In order to prevent potential risks, patients have to be monitored so that an epileptic episode can be detected early and preventive measures provided. Many different research studies have used a combination of time and frequency features for the automatic recognition of epileptic seizures. In this paper, two fusion methods are compared. The first is based on an ensemble method and the second uses the Choquet fuzzy integral method. In particular, three different machine learning approaches, namely RNN, ML and DNN, are used as inputs for the ensemble method and the Choquet fuzzy integral fusion method. Evaluation measures such as the confusion matrix, AUC and accuracy are compared, and MSE and RMSE are also provided. 
The results show that the Choquet fuzzy integral fusion method outperforms the ensemble method as well as other state-of-the-art classification methods.</p> </abstract>ARTICLE2021-10-08T00:00:00.000+00:00A New Hand-Movement-Based Authentication Method Using Feature Importance Selection with the Hotelling’s Statistic<abstract> <title style='display:none'>Abstract</title> <p>The growing amount of collected and processed data means that there is a need to control access to these resources. Very often, this type of control is carried out on the basis of biometric analysis. The article proposes a new user authentication method based on a spatial analysis of the movement of the finger’s position. This movement creates a sequence of data that is registered by a motion recording device. The presented approach combines spatial analysis of the position of all fingers at the time. The proposed method is able to use the specific, often different movements of fingers of each user. The experimental results confirm the effectiveness of the method in biometric applications. In this paper, we also introduce an effective method of feature selection, based on the Hotelling T<sup>2</sup> statistic. This approach allows selecting the best distinctive features of each object from a set of all objects in the database. It is possible thanks to the appropriate preparation of the input data.</p> </abstract>ARTICLE2021-10-08T00:00:00.000+00:00Evaluating Dropout Placements in Bayesian Regression Resnet<abstract> <title style='display:none'>Abstract</title> <p>Deep Neural Networks (DNNs) have shown great success in many fields. Various network architectures have been developed for different applications. Regardless of the complexities of the networks, DNNs do not provide model uncertainty. Bayesian Neural Networks (BNNs), on the other hand, are able to make probabilistic inference. 
Among various types of BNNs, <italic>Dropout as a Bayesian Approximation</italic> converts a Neural Network (NN) to a BNN by adding a dropout layer after each weight layer in the NN. This technique provides a simple transformation from a NN to a BNN. However, for DNNs, adding a dropout layer to each weight layer would lead to a strong regularization due to the deep architecture. Previous studies [1, 2, 3] have shown that adding a dropout layer after each weight layer in a DNN is unnecessary. However, how to place dropout layers in a ResNet for regression tasks is less explored. In this work, we perform an empirical study on how different dropout placements would affect the performance of a Bayesian DNN. We use a regression model modified from ResNet as the DNN and place the dropout layers at different places in the regression ResNet. Our experimental results show that it is not necessary to add a dropout layer after every weight layer in the Regression ResNet to enable Bayesian inference. Placing dropout layers between the stacked blocks (i.e., Dense+Identity+Identity blocks) has the best performance in Predictive Interval Coverage Probability (PICP). Placing a dropout layer after each stacked block has the best performance in Root Mean Square Error (RMSE).</p> </abstract>ARTICLE2021-10-08T00:00:00.000+00:00Mixup (Sample Pairing) Can Improve the Performance of Deep Segmentation Networks<abstract> <title style='display:none'>Abstract</title> <p>Researchers address the generalization problem of deep image processing networks mainly through extensive use of data augmentation techniques such as random flips, rotations, and deformations. A data augmentation technique called mixup, which constructs virtual training samples from convex combinations of inputs, was recently proposed for deep classification networks. 
The algorithm contributed to increased performance on classification in a variety of datasets, but so far has not been evaluated for image segmentation tasks. In this paper, we tested whether the mixup algorithm can improve the generalization performance of deep segmentation networks for medical image data. We trained a standard U-net architecture to segment the prostate in 100 T2-weighted 3D magnetic resonance images from prostate cancer patients, and compared the results with and without mixup in terms of Dice similarity coefficient and mean surface distance from a reference segmentation made by an experienced radiologist. Our results suggest that mixup offers a statistically significant boost in performance compared to non-mixup training, leading to an increase of up to 1.9% in Dice and a 10.9% decrease in surface distance. The mixup algorithm may thus offer an important aid for medical image segmentation applications, which are typically limited by severe data scarcity.</p> </abstract>ARTICLE2021-10-08T00:00:00.000+00:00Anomaly Pattern Detection in Streaming Data Based on the Transformation to Multiple Binary-Valued Data Streams<abstract> <title style='display:none'>Abstract</title> <p>Anomaly pattern detection in a data stream aims to detect a time point where outliers begin to occur abnormally. Recently, a method for anomaly pattern detection has been proposed based on binary classification for outliers and statistical tests in the data stream of binary labels (normal or outlier). It showed that an anomaly pattern can be detected accurately even when outlier detection performance is relatively low. However, since the anomaly pattern detection method is based on the binary classification for outliers, most well-known outlier detection methods, with the output of real-valued outlier scores, cannot be used directly. 
In this paper, we propose an anomaly pattern detection method in a data stream using the transformation to multiple binary-valued data streams from real-valued outlier scores. By using three outlier detection methods, Isolation Forest (IF), Autoencoder-based outlier detection, and Local Outlier Factor (LOF), the proposed anomaly pattern detection method is tested using artificial and real data sets. The experimental results show that anomaly pattern detection using Isolation Forest gives the best performance.</p> </abstract>ARTICLE2021-10-08T00:00:00.000+00:00
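The last abstract above describes turning real-valued outlier scores into multiple binary-valued data streams, but it does not say how the mapping is done. The following is only a minimal sketch under one assumption: each binary stream comes from comparing the scores with one fixed threshold. The function names `to_binary_streams` and `outlier_rate`, the thresholds, and the sample scores are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def to_binary_streams(
    scores: Sequence[float], thresholds: Sequence[float]
) -> List[List[int]]:
    """Turn one stream of real-valued outlier scores into one binary stream
    per threshold (assumed scheme). A 1 marks a score above the threshold."""
    return [[1 if s > t else 0 for s in scores] for t in thresholds]


def outlier_rate(stream: Sequence[int], start: int, width: int) -> float:
    """Fraction of flagged observations in stream[start:start + width].

    Returns 0.0 for an empty window so callers never divide by zero.
    """
    window = stream[start:start + width]
    return sum(window) / len(window) if window else 0.0


if __name__ == "__main__":
    # Hypothetical scores: low at first, then consistently high.
    scores = [0.1, 0.2, 0.15, 0.9, 0.8, 0.95]
    streams = to_binary_streams(scores, thresholds=[0.5, 0.85])
    for threshold, stream in zip([0.5, 0.85], streams):
        early = outlier_rate(stream, 0, 3)
        late = outlier_rate(stream, 3, 3)
        print(threshold, stream, early, late)
```

A statistical test on each binary stream, as the abstract mentions, would then compare the flagged rate in a recent window against a reference window. That step is left out here because the abstract does not specify the test.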