Solution to Analysis of IT System User Behaviour Using AI/ML Algorithms

Abstract: Insufficient user involvement, lack of user feedback, and incomplete or changing user requirements are among the critical reasons why information systems (IS) are difficult to use, which could potentially reduce the number of customers. In the authors' previous research, a method for analysing the behaviour of IT system users was developed; it is intended to improve the usability of the system and could thus increase the efficiency of business processes. The method is based on graph search algorithms, Markov chains and a machine learning approach. This paper details the output data of the method, determines their importance through expert evaluation, and demonstrates the visual presentation of different UX analysis situations. The paper briefly recalls the essence of the method, including both the input and output data sets, and, with the help of experts, evaluates the expected results in terms of their importance for UX analysis. It also introduces a visualization prototype developed to obtain the output data, which makes it possible to verify the feasibility of the input/output data transformation and the potential for acquiring the expected data.


I. INTRODUCTION
Nowadays, the impact of the Fourth Industrial Revolution (Industry 4.0) [1] is not limited to industrial production processes. The development of Industry 4.0-related technologies has encouraged their application beyond traditional industries and has led to significant changes in the organization and management of any business process. The growth of Industry 4.0-related technologies is also influenced by the availability of the Internet, communication technology, and (real-time) processor computing power. All this contributes to the leading role of digitization not only in the industrial but also in the social and private sectors, as evidenced by a number of studies [2]- [4]. Current market trends determine the growing need to digitalize any business process that can be transformed from manual to digital, as this improves the efficiency of service execution and reduces the cost of providing it.
As the volume of different IT systems within one organisation increases, so does the complexity of the systems and the requirements for the usability of the systems. This poses new challenges for the development and application of relevant methods in the development and maintenance of IT systems with a strong focus on digital user behaviour.
*Corresponding author's email: Vitalijs.Zabinako@abcsoftware.lv
Both the growth of digital services and their rapid transformation into a cloud service environment, as well as digital process support methods, do not guarantee that they are created for the convenient use of IT systems by all stakeholders. In addition, the COVID pandemic has further demonstrated the growing role of digitization in everyone's lives, and it is clear that the world will no longer be what it used to be. A number of solutions in different areas will be used digitally in parallel with on-site services, and there will be an increasing number of solutions that will remain only digital when they go live.
Insufficient user involvement, lack of user feedback, incomplete and changing user requirements are some of the critical reasons for the difficulty of using digital solutions. This could potentially reduce the number of customers. Currently, researchers are focusing on the development of user experience analysis methods and tools. One of today's challenges is the development of an up-to-date integrated solution that, firstly, would be able to analyse the use of a particular IT solution (system) and, secondly, be universal without requiring additional resources for selecting and preparing specific inputs for user experience analysis, i.e., would allow for the automatic accumulation of all necessary input information and generate the expected result.
The object of research is information systems, in which logs of events or activities are created, in which the activities of users authorized in the system are analysed. Authorized user activity records contain information about user attributes and behavioural parameters that can be used to determine usability characteristics, impact factors, efficiency criteria and other usability heuristics. The hypothesis of the research is that the obtained usability characteristics provide a basis for determining the information necessary to improve the user experience.
The paper is structured as follows. Section II describes the background of the method, Section III determines the importance of the input data with the help of experts, and Section IV verifies the visualization potential of the expected output data. Section V presents an application case, and Section VI concludes the paper.

II. BACKGROUND OF THE METHOD FOR THE USER BEHAVIOUR ANALYSIS
The method developed in the authors' previous study [5] analyses user activities with a view to improving the usability of an IT system. The results of the study are intended to be used to incorporate the user behaviour analysis method into the prototype of the Suspicious Activity Identification (SAI) software product [6]. Appropriate integration requires that data about users and their sessions be used to define goals. First, the "final steps" that users want to reach are identified, and all sessions that contain such steps are selected. Next, the goals themselves are defined, and the behavioural profiles of the relevant users are created and customized. This allows for automatic analysis of the created profiles against specific target criteria, as well as, at the stage of the original "Suspicious Activity Identification" software product, monitoring and analysis of new user session activities for specific purposes. As a result, all accumulated information is visualized in a form that is useful for the UX analyst.
The user behaviour analysis method is expected to complement the Suspicious Activity Identification (SAI) software prototype with the following functionality:
• Ability to define and analyse the achievement of the user's goals.
• Ability to store additional demographic data about the user (date of creation of the user account, age, gender, etc.).
• Ability to define user groups based on user demographics.
• Possibility to perform model training for all dynamic user groups.
• Presentation of user group analysis.
• Presentation of monitoring results of new user sessions.
As a result, the method developed within the study makes it possible to identify those user groups or technical processes that show relatively inefficient or unstable IT system usage indicators. The task is to identify the groups of users who need additional training, as well as to identify those financial industry data processing activities (too long, unstable, etc.) that negatively affect the overall user performance of the system (execution speed and accuracy), in order to provide recommendations for software usability improvements.
The solution to any problem can be represented as an algorithm that processes possible input data and allows "calculating" the necessary output data. In addition, for the full development and potential implementation and application of such an algorithm, it is necessary to rethink not only the content of input and output data, but also the format of input data receiving and the form of output data transmission and visualization. The case of UX analysis [5] presents source and target data for UX improvement and algorithms for target data retrieval.
The determination of the input data is based on the analysis of user experience (UX) [7], customer experience (CX) [8] and various sets of system usability heuristics [9], [10] and efficiency criteria listed in different scientific papers [11]- [13], which are used for business processes and system usability efficiency improvement. The sufficient data set as input data according to [5] should contain audit records of user activities and user profile characteristics.
The specifics of the described input and output data and the study of their possible transformations made it possible to base the implementation of the user behaviour analysis method on graph processing algorithms and machine learning methods: the input data set in the form of audit records can be represented as vertices with transitions between them, which form a graph, and the output data set requires training and comparison of such graphs. The method proposed in [5] assumes that it is possible to obtain all output data defined as different characteristics of IS usage and users. The output data set defined in [5] is the following:
• The shortest path from source to target.
• The longest path from source to target.
• The most popular journey.
• Typical behaviour scenarios for different user groups.
• Typical behavioural scenarios to achieve the set goal.
• Wrong transition from one step to another (with return to previous step).
• The most often visited step (action).
• The most rarely visited step (action).
• Not visited step.
• Repeated action within a particular scenario.
• Atypical action within a particular scenario.
• The shortest time performing a particular step (potentially redundant action).
• The longest delay in a particular action.
• User emotions and sentiments.
Current research is focused on, firstly, assessing whether all input data are necessary and important in UX analysis in the context of IS usability improvement (discussed in Section III) and, secondly, testing the transformation implementation potential by creating visualization sketches of all input data (presented in Section IV).
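Several of the output characteristics listed above follow directly from the graph representation of audit records. The following minimal Python sketch illustrates this under stated assumptions: the session data, the step names and all function names are hypothetical and are not taken from [5], and breadth-first search is used here only as one possible graph search algorithm for the shortest path.

```python
from collections import Counter, deque

# Hypothetical audit records: one ordered list of visited steps per user session.
sessions = [
    ["Login", "Accounts", "Payment", "Confirm"],
    ["Login", "Accounts", "Payment", "Accounts", "Payment", "Confirm"],
    ["Login", "Payment", "Confirm"],
    ["Login", "Accounts", "Payment", "Confirm"],
]
# Hypothetical list of all steps that exist in the system.
all_steps = {"Login", "Accounts", "Payment", "Confirm", "Help"}


def build_graph(sessions):
    """Count how often each directed transition (a -> b) was observed."""
    edges = Counter()
    for steps in sessions:
        edges.update(zip(steps, steps[1:]))
    return edges


def shortest_path(edges, source, target):
    """Breadth-first search over observed transitions (fewest steps)."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, []).append(b)
    queue, seen = deque([[source]]), {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in neighbours.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # target is not reachable from source


def returns_to_previous(steps):
    """Transitions a -> b that are immediately followed by a return to a."""
    return [(steps[i], steps[i + 1])
            for i in range(len(steps) - 2)
            if steps[i + 2] == steps[i]]


edges = build_graph(sessions)
visits = Counter(step for steps in sessions for step in steps)
most_popular_journey = Counter(tuple(s) for s in sessions).most_common(1)[0][0]
not_visited = all_steps - set(visits)

print(shortest_path(edges, "Login", "Confirm"))
print(most_popular_journey, visits.most_common(1), not_visited)
print(returns_to_previous(sessions[1]))
```

Time-related and sentiment-related characteristics would need additional fields in the audit records (timestamps, feedback) and are not covered by this sketch.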

III. DETERMINING THE IMPORTANCE OF INPUT DATA THROUGH EXPERT ESTIMATION
There are very few characteristics in the use of IS that can be quantified. In order to be able to quantify the user journey and individual activities and, thus, influence the usability of IS, it is important to define the quality characteristics of the methodology. One possible way is to form a list of user journey quality indicators, assess the importance of each indicator with the help of experts and find out the degree of compliance of a particular IS with specific requirements. By user journey, the authors understand the "path" of interface views and actions that the user follows in executing their task. The task is a larger set of actions (including actions outside the information system) for achieving a particular goal.
Expert survey methods can be used to determine the characteristics that must be met in the analysis of the user experience. One of the most common methods for obtaining a harmonized opinion of experts on the importance of a certain set of criteria is an expert interaction procedure, the Delphi method [14]. The Delphi method consists of a series of procedures that allow the expert group to form an opinion on issues for which less information is available than necessary.
In this section, the Delphi method is applied to the list of user attributes and activities defined as the input data set in the previous section, in order to determine whether the input data defined in the method are equally important from the point of view of UX specialists and, thus, check the quality of one of the method components.

A. Formation of an Expert Group
The formation of an expert group involves the selection of experts. The group must consist of 10-20 people and must be formed according to a certain procedure. One of the necessary rules is equal competence of the experts. To assess the importance of the required characteristics of UX analysis, IT specialists were invited who, within the scope of their duties, deal with the analysis of user behaviour in the use of IS. As a result of the survey, the answers of 20 respondents representing 12 companies in various positions were collected. The positions held by the respondents in UX analysis are shown in Fig. 1. The figure shows that the number of positions is higher than the number of respondents; this is because respondents were able to indicate several positions within which they performed user experience analysis activities.

B. Making of a Questionnaire
In the classic version, the expert's work starts from a "clean" ("white") page, i.e., the expert initially offers their own list of indicators against which the quality of the methodology can be assessed. However, to facilitate the work of the experts, it is worth conducting the first experiment by evaluating the indicators of a proposed questionnaire that have already been chosen by the organisers of the experiment. Following the results of the survey, the initial version of the list of indicators has been revised, taking into account the opinions of experts. In the method validation task, the initial lists of users and their behavioural characteristics had already been obtained from the list of method components. Therefore, the experts were immediately offered to evaluate the lists of characteristics:
1. -Department (for IS inner users)
The experts were offered two lists: one with the characteristics of user behaviour (journeys and their activities), and the other with the characteristics of the user. The original lists for validation were slightly modified, both by combining different characteristics of one type into a more general one (e.g., more often/less/never) and by dividing one characteristic into a few more general ones (e.g., the age characteristic).
In order to check the attitude of experts in the selection of assessments, the lists were extended with duplicates, i.e., two different names for the same or totally opposite characteristics (e.g., pairs like 1.1. and 1.16, etc.). Different characteristics with the same meaning must be rated in the same way. If the ratings differ, the answers of such an expert should be excluded from the respondents' questionnaires [15]. After the analysis of the submitted evaluations, the answers of one respondent were removed (Fig. 1 shows the positions of the 20 respondents whose questionnaires were accepted).

C. Filling in the Questionnaire
Experts were invited to complete the questionnaire by rating importance on the following scale:
0 - Useless information, in the opinion of an expert.
1-3 - Information that can be ignored in the UX.
4-5 - Useful but not very important information.
6-7 - Information that might be usable in the UX analysis.
8-9 - Information that must be available in the user UX.
10 - Very important and useful information (that might not be available yet).
The Delphi method [14] is a series of sequential procedures designed to form the opinion of a group of experts. The interaction of the experts with the organisers of the experiment takes place over several iterations; the results of each iteration are processed by appropriate statistical methods and communicated to the experts. There are usually four iterations, but if a sufficient level of coherence of expert opinions is reached earlier, the survey may be terminated. As part of the evaluation of the characteristics of users and their activities, one iteration was performed, at the end of which a sufficiently high level of harmonization of expert opinions was achieved.

D. Processing of the Obtained Results
From the evaluations of all questionnaires, a matrix of expert opinions is compiled, in which the values of each characteristic of each expert are indicated. For better visibility in the report, the ratings are presented in the form of graphs, which are shown for the characteristics of user behaviour in Fig. 2 and for the characteristics of user attributes in Fig. 3.
The scores given by each expert are ranked as follows: the maximum score is given a rank of 1 and the minimum a rank of 10 (the number of activity characteristics) or 9 (the number of attribute characteristics). If several characteristics have the same score, each of them is assigned the arithmetic mean of the rank positions they occupy.
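The ranking rule described above can be illustrated with a minimal Python sketch. The function name and the example scores are hypothetical; the sketch only assumes that the highest score receives rank 1 and that equal scores share the arithmetic mean of the positions they occupy.

```python
def average_ranks(scores):
    """Rank scores so that the highest score gets rank 1.

    Equal scores share the arithmetic mean of the positions they occupy.
    """
    # Indices of the scores, ordered from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Extend `end` over the whole group of equal scores.
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2  # mean of 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Hypothetical scores of one expert for four characteristics.
print(average_ranks([10, 8, 8, 5]))  # [1.0, 2.5, 2.5, 4.0]
```

With this rule the ranks of one expert always sum to m(m + 1)/2 for m characteristics, regardless of ties.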
For each indicator $j$, its importance $S_j$ (the sum of the ranks $r_{ij}$ assigned to it by the experts) and the average importance $\bar{S}$ are calculated:

\[ S_j = \sum_{i=1}^{n} r_{ij}, \qquad \bar{S} = \frac{1}{m} \sum_{j=1}^{m} S_j, \tag{1} \]

where $m$ is the number of characteristics and $n$ is the number of experts.
For each indicator, the deviation $d_j$ of its importance from the average, the square of the deviation $d_j^2$ and the sum of squares $S$ are calculated:

\[ d_j = S_j - \bar{S}, \qquad S = \sum_{j=1}^{m} d_j^2. \tag{2} \]

For each expert, the following is calculated:

\[ T_i = \sum_{k} \left( t_{ik}^3 - t_{ik} \right), \tag{3} \]

where $t_{ik}$ is the number of equal ranks in the $k$-th group of tied ranks for the $i$-th expert.
The concordance or consensus coefficient $K$, which shows the coherence of the experts, is calculated for the experiment (in the standard form of Kendall's coefficient of concordance with a correction for tied ranks):

\[ K = \frac{12\,S}{n^2 (m^3 - m) - n \sum_{i=1}^{n} T_i}. \tag{4} \]

The value of the consensus coefficient is between 0 and 1. If the consensus coefficient is significantly less than 1, then it is necessary to perform another iteration of the survey. Prior to the relevant iteration, it is necessary to talk to the experts to find out why their opinions differ. The closer the coefficient is to 1, the more the opinions of the experts agree. The rank matrix and mathematical transformations are not included in the report. The consensus coefficient of the experiment under consideration is equal to 0.63 for the list of activity characteristics and to 0.52 for the list of attribute characteristics.
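The calculation of the consensus coefficient can be sketched in Python as follows. This is a minimal illustration that assumes the coefficient is Kendall's coefficient of concordance with a correction for tied ranks, K = 12 S / (n^2 (m^3 - m) - n * sum(T_i)); the function name and the rank matrices are hypothetical and are not the survey data.

```python
from collections import Counter


def concordance(rank_matrix):
    """Kendall's coefficient of concordance with tie correction (assumed form).

    rank_matrix[i][j] is the rank given by expert i to characteristic j.
    """
    n = len(rank_matrix)       # number of experts
    m = len(rank_matrix[0])    # number of characteristics
    # Importance of each characteristic: sum of its ranks over all experts.
    sums = [sum(row[j] for row in rank_matrix) for j in range(m)]
    mean = sum(sums) / m
    # Sum of squared deviations of the importance from the average.
    s = sum((x - mean) ** 2 for x in sums)
    # Tie correction per expert: sum of (t^3 - t) over groups of equal ranks.
    ties = [sum(c ** 3 - c for c in Counter(row).values()) for row in rank_matrix]
    return 12 * s / (n ** 2 * (m ** 3 - m) - n * sum(ties))


# Three hypothetical experts who agree completely.
print(concordance([[1, 2, 3, 4]] * 3))          # 1.0
# Two hypothetical experts with exactly opposite rankings.
print(concordance([[1, 2, 3], [3, 2, 1]]))      # 0.0
```

Values between these two extremes, such as those reported for the survey, indicate partial agreement among the experts.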
In the experiment under review, the most important characteristics that UX specialists noted as necessary in the analysis of user experience are listed below (both lists are arranged in descending order of their importance): 1. Characteristics of user behaviour (journeys): 9. Income level.
The obtained results demonstrate that all UX characteristics defined in the components of the method as the desired/expected input data are located at the top of the resulting lists, with a sufficiently high level of harmonization of expert opinions. In addition, when the different types of characteristics are marked in different coloured fonts, it can be seen that, among the characteristics of user activities, UX specialists are primarily interested in journeys, i.e., sets of actions, followed by individual actions as such; the third priority is time-related criteria, with the duration of an action being more important than the duration of the interval between actions. Analysing the list of user attributes, it can be seen that UX specialists are primarily interested in experience of using the IS and in age; moreover, the retirement age threshold is more important than the adulthood threshold, and gender and income level are at the bottom of the list.
Additionally, experts mentioned that it would be desirable to identify the functional areas of the information system that the user had difficulty finding, perceiving and using. Another important aspect mentioned was the frequency of journeys over time, i.e., how many times a day, a week or a month a journey had to be made. It was mentioned that it would be necessary to find out how users' journeys differ depending on the personal characteristics of a particular user: goals, motivation and specific requirements. It was also noted that automatically obtained information about the user experience only partially describes the causes of problems, and that it is also necessary to ask the users themselves what exactly happened and affected their experience.
Regarding the data visualization itself, experts mentioned easy-to-understand visual diagrams, visual scenarios, comparative graphs, "bar" diagrams, user "journey maps", tables and pivot graphs, and explanatory diagrams. It was further mentioned that it would be desirable for all of this to be as interactive as possible, with the possibility both to look at the "overall picture" and to obtain detailed "deeper" information on a case-by-case basis.

IV. VERIFICATION OF THE EXPECTED OUTPUT DATA VISUALIZATION POTENTIAL
The previous section confirms the lists of expected user activities and attributes as output data that are planned to be obtained by the validated method. The answers of experts to the open-ended questions of the questionnaire provided a justification for the solutions for displaying the input data. In this section, the output data are examined from the point of view of whether it is possible to design a visual representation of each data set in the form of journey maps representing users' journeys. The description of the implementation potential of the obtained input data is clarified by defining the regularities between the output and the respective input data.
A common scheme for visualizing all validation situations is shown in Fig. 4. When starting the analysis, the UX specialist can adjust the selection of users according to the given attributes and define the time period for which the analysis of user sessions is to be performed. The selection of attributes is based on the results of the survey described in the previous section: the most important user characteristics from the experts' point of view are offered for user selection. As a result, a graphical representation of all selected users' journeys is shown, which makes it possible to select the characteristic of user activities in the context of which the UX analysis is to be performed. This choice is determined by the list of the most important user behaviour characteristics compiled in the previous section, which are grouped into characteristics of the journey and characteristics of individual user activities. The cases offered for analysis by the proposed tool are the following:
1. Goals: The case of the most/least popular goal shows the journeys within the selected time period, and each achieved goal is depicted in a different colour, depending on the number of users who achieved that goal. Statistical information in the form of a bar chart shows the number of users for each goal, ordered from the most popular (top) to the least popular (bottom). This makes it possible to identify the most and least popular and visited goals.
The case of achieved/unachieved goals shows the journeys corresponding to the description of the achieved/unachieved goals. The user activity (a vertex of the graph) that is selected by the UX analyst, and from which deviations from the target are to be calculated, is depicted in red (dark grey in Fig. 4); deviations are journeys whose initial stage "A"-"L" coincides but in which "Goal n" is not reached at the end of the journey. Statistical information in the form of a bar chart shows deviations from the target for different vertices, so the UX analyst can identify the activities with the highest percentage of goal achievement (if the user has reached such an activity, the goal is likely to be reached) and the activities that cause the user to "leave" the journey. This makes it possible to identify specific stages of the journey that need to be changed (or, at the very least, considered for improvement).
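The deviation statistics of this case can be illustrated with a minimal Python sketch. The journeys, the activity names and the function name are hypothetical; the sketch assumes that each journey is an ordered list of activities whose last element is the point at which the journey ended.

```python
def goal_reach_rate(journeys, goal):
    """For each activity, the share of journeys passing through it that end at `goal`."""
    passed, reached = {}, {}
    for steps in journeys:
        achieved = steps[-1] == goal
        # Count each activity once per journey; the final step itself is excluded.
        for activity in set(steps[:-1]):
            passed[activity] = passed.get(activity, 0) + 1
            reached[activity] = reached.get(activity, 0) + (1 if achieved else 0)
    return {a: reached[a] / passed[a] for a in passed}


# Hypothetical journeys; the second one leaves before reaching the goal.
journeys = [
    ["A", "B", "C", "Goal 1"],
    ["A", "B", "Exit"],
    ["A", "D", "Goal 1"],
    ["A", "B", "C", "Goal 1"],
]
rates = goal_reach_rate(journeys, "Goal 1")
for activity in sorted(rates):
    # The deviation from the target is the complement of the reach rate.
    print(activity, f"reach={rates[activity]:.0%}", f"deviation={1 - rates[activity]:.0%}")
```

In this example activity "B" has the largest deviation, which would mark it as a candidate stage for closer inspection.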
The case of the most/least frequently performed journeys shows the user journey that is performed most or least frequently. Green (light grey in Fig. 4) indicates the user activities (graph vertices) that are most often performed by users when visiting the IS, the so-called most popular path. Statistical information in the form of a bar chart shows the percentage of journeys that pass through various activities to achieve a certain goal. This information makes it possible to identify the activities at which delays are possible, i.e., where the user turns away from the most popular journey and performs additional actions. Such information would be useful for a UX expert to "optimize" possible user "journey" scenarios.
The case of the number of actions performed by the user during the journey shows the number of actions performed by the user on the journey. Green (light grey in Fig. 4) shows the user activities (graph vertices) that form the journey with the minimum number of visited vertices, the so-called shortest path. Statistical information in the form of a bar chart shows the number of user journeys completed using the shortest journey (with a minimum number of activities) and the number of user journeys completed using the maximum number of activities visited. Depending on the amplitude of the journey length, the percentage of journeys completed in the middle of the range is also displayed. This information enables the UX analyst to observe the delay in achieving the goal when a higher percentage of users had to take the longest possible journey, as well as to assess the "complexity" of user journeys in general from a quantitative point of view.
The case of the time spent by the user during the journey until the goal is achieved shows all journeys for the selected goal. The journey that took the least amount of time from start to end is highlighted in green (light grey in Fig. 4). Statistical information in the form of a bar chart shows the percentage of users who spent the least, average and maximum time reaching their goals. This information enables the UX analyst to evaluate which goal took users the longest to reach, and also to make an overall evaluation of user journey "complexity" in terms of time spent.
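The bar chart statistics for journey length and journey time can be sketched in Python as follows. The split of the range between the minimum and maximum value into three equal-width parts is an assumption made for illustration, as are the function name and the example durations.

```python
def range_shares(values, buckets=3):
    """Percentage of values in each equal-width part of the range [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / buckets or 1  # avoid division by zero when all values are equal
    counts = [0] * buckets
    for v in values:
        # The maximum value falls into the last bucket.
        idx = min(int((v - lo) / width), buckets - 1)
        counts[idx] += 1
    return [100 * c / len(values) for c in counts]


# Hypothetical journey durations in seconds for one goal.
durations = [30, 35, 40, 60, 90, 120, 32, 33]
shortest, middle, longest = range_shares(durations)
print(f"shortest: {shortest}%, middle: {middle}%, longest: {longest}%")
```

The same function could be applied to the number of actions per journey instead of the duration.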
The case of the most/least often performed actions shows the number of completions of each action on a user's journey in different colours, depending on the number of user visits. The minimum and maximum number of visits are determined and, depending on these, each vertex is coloured according to a defined colour gradient from lightest (less frequently performed) to darkest (more frequently performed). In this image, the most frequently visited vertices are marked in red (dark grey in Fig. 4) and are located on the most popular journey, while light yellow represents a less frequent branch of activity. Statistical information in the form of a bar chart shows the percentage of visits to certain activities from the maximum to the minimum with a predefined range, i.e., depending on the range of visits, the percentage of user visits in the middle of the range is also displayed. This information allows the UX analyst to observe which activity may be a bottleneck that a higher percentage of users had to pass through; conversely, if an activity is infrequently or never visited, there may be a reason to consider replacing it or merging it with another activity.
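The colouring of vertices by visit count can be illustrated with a minimal Python sketch. Linear interpolation between a light yellow and a red RGB value is an assumption made for illustration; the function name, the colour values and the visit counts are hypothetical.

```python
def visit_colour(visits, lo, hi, light=(255, 255, 204), dark=(255, 0, 0)):
    """Map a visit count to a hex colour between `light` (lo) and `dark` (hi)."""
    # Position of the visit count inside the range [lo, hi], from 0.0 to 1.0.
    share = 0.0 if hi == lo else (visits - lo) / (hi - lo)
    channels = (round(l + share * (d - l)) for l, d in zip(light, dark))
    return "#" + "".join(f"{c:02x}" for c in channels)


# Hypothetical visit counts per activity (graph vertex).
visit_counts = {"Login": 400, "Accounts": 310, "Payment": 400, "Help": 12}
lo, hi = min(visit_counts.values()), max(visit_counts.values())
colours = {a: visit_colour(v, lo, hi) for a, v in visit_counts.items()}
print(colours)
```

The most visited vertices receive the darkest colour and the least visited vertex the lightest one, which corresponds to the gradient described in this case.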
The case of the shortest/longest time to complete an action shows the journey corresponding to the characteristic of the shortest/longest performance time. The user activity selected by the UX analyst (the highlighted node of the graph) is shown in red (dark grey in Fig. 4); the time spent in this activity is calculated for all user journeys, and the statistical information shows the percentage of journeys across the range from the minimum to the maximum time spent in this activity. This information enables the UX analyst to judge the activity: if a higher percentage of user journeys is closer to the maximum time, the activity requires more time to complete and may need special attention; conversely, if a higher concentration of user journeys is closer to the minimum time, users spend less time on the action and, from the point of view of UX analysis, the action is fine. In addition to the level of "complexity" obtained in this section, this case also makes it possible to assess the quantitative indicators of the final time spent.
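The time spent in a selected activity can be derived from timestamped audit records, as the following minimal Python sketch shows. It assumes that the time spent in an activity is the interval until the next recorded activity in the same session; the record layout, the function names and the example data are hypothetical.

```python
# Hypothetical audit records: (activity, timestamp in seconds) per session, ordered by time.
sessions = [
    [("Login", 0), ("Accounts", 12), ("Payment", 20), ("Confirm", 95), ("Logout", 101)],
    [("Login", 0), ("Payment", 5), ("Confirm", 25), ("Logout", 30)],
]


def activity_durations(session):
    """Time spent in each activity, taken as the time until the next recorded activity."""
    return [(act, t_next - t) for (act, t), (_, t_next) in zip(session, session[1:])]


def duration_range(sessions, activity):
    """Minimum and maximum time spent in `activity` over all sessions."""
    times = [d for s in sessions for act, d in activity_durations(s) if act == activity]
    return (min(times), max(times)) if times else None


print(activity_durations(sessions[0]))
print(duration_range(sessions, "Payment"))
```

The last activity of a session has no following record, so no duration is derived for it in this sketch.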

V. APPLICATION CASE
The cases described in the previous section, with the corresponding journey maps, analytical information and potential conclusions supported by the tool, have been applied to data obtained from one month of work with an internet bank. The UX analysis cases were run on the user sessions, and the obtained results, which indicated several situations requiring attention, were presented to the customer and were highly appreciated by the bank's UX specialists. This confirms the potential for successful implementation of the proposed method and its tool support. Because information about internet bank usage is highly confidential, these situations cannot be published; in general, however, all of them corresponded to non-trivial sequences of steps inside the journeys and to abnormal time spent on several actions, and all the statistical information was interesting and useful for further improvement of business process efficiency.

VI. CONCLUSION AND FUTURE RESEARCH
Summarising the above, it can be concluded that the most important characteristics (activities and attributes) for user behaviour analysis were successfully identified and validated by experts in the field, with a sufficiently high consensus rate, which further supports the accuracy of the work and its conclusions. Based on the results of this expert survey, several scenarios for the analysis of user data and user behaviour, with visualization that would be convenient for UX specialists, have been developed and proposed. They will allow for well-founded reasoning about user activities in any information system, about users' "journeys" and about the overall success of the IS usage experience.
Additional opinions received from experts on important attributes of users and their experience make it possible to develop the work further and to offer new scenarios for the analysis of user behaviour that would be highly efficient and effective for the work of UX specialists.
In addition, the experiments with the necessary data visualization, in which sketches of the proposed input data (journey maps and analytical information) were produced for each characteristic of the result, confirm both that the input data set is complete, consistent and sufficient to obtain the output data, and that the output data can be represented in a form used in UX analysis.
The goal of the outsourced service to validate the operations of the technological components of the initial version of the system model has been achieved and the validation tasks defined to achieve the goal have been fulfilled:
1. A list of characteristics (activities and attributes) has been compiled for the analysis of user behaviour.
2. Based on the analysis of the accumulated expert data, the topicality of the defined list has been confirmed and the most important characteristics have been determined.
3. A common scheme for visualization of validation situations has been developed.
4. Visualization UI sketches have been developed for each validation situation.
5. The potential realization of the evaluation of each characteristic is described with the justification of possible conclusions of UX analysis.
In the course of the research, it has been established that the output data defined in the method [5] are important from the point of view of experts (UX specialists) and it is possible to obtain them from a defined input data set. The visualization potential of each output data set with the implementation description confirms that the output data set corresponds to what was expected.