JOURNAL BEARING PERFORMANCE PREDICTION USING MACHINE LEARNING AND OCTAVE-BAND SIGNAL ANALYSIS OF SOUND AND VIBRATION MEASUREMENTS

Journal and thrust bearings utilise hydrodynamic lubrication to reduce friction and wear between the shaft and the bearing. The process to determine the lubricant film thickness or the actual applied load is vital to ensure proper and trouble-free operation. However, taking accurate measurements of the oil film thickness or load in bearings of operating engines is very difficult and requires specialised equipment and extensive experience. In the present work, the performance parameters of journal bearings of the same principal dimensions are measured experimentally, aiming at training a Machine Learning (ML) algorithm capable of predicting the loading condition of any similar bearing. To this end, an experimental procedure using the Bently Nevada Rotor Kit 4 is set up, combined with sound and vibration measurements in the vicinity of the journal bearing structure. First, sound and acceleration measurements for different values of bearing load and rotational speed are collected and post-processed utilising 1/3 octave band analysis techniques, for parametrisation of the input datasets of the ML algorithms. Next, several ML algorithms are trained and tested. Comparison of the results produced by each algorithm determines the fittest one for each application. The results of this work demonstrate that, in a laboratory environment, the operational parameters of journal bearings can be efficiently identified utilising non-intrusive sound and vibration measurements. The presented approach may substantially improve bearing condition identification and monitoring, which is an imperative step to prevent journal bearing failures and conduct condition-based maintenance.


INTRODUCTION
There are several approaches used for condition monitoring and predictive maintenance of journal bearings, such as vibration, noise and acoustic emission monitoring and analyses, focusing on detecting and identifying patterns and trends in the recorded signals, and correlating them with present or upcoming fault conditions [1,2]. Further, lubricating oil and wear debris analyses [3] are commonly used for assessing lubricating oil quality [4], focusing on analysis of the size, shape, quantity and composition of wear particles generated during operation, correlating the findings to the machine condition, and determining the effective wear mechanisms (sliding, rubbing, rolling, abrasion, etc.). Among them, vibration analysis is the most popular in practical mechanical engineering applications, supported by a wide related literature, mainly for roller bearing condition assessment. On the other hand, in marine engineering applications, and particularly in the study of line and stern tube bearings of the propulsion shafts, the most applicable method is that of oil temperature monitoring, due to the very compact designs, accessibility restrictions and limited advanced sensor equipment onboard modern vessels.
In marine applications, the propulsion shaft is supported by a large number of journal bearings, forming a statically indeterminate multi-supported beam structure. The vertical offsets of the supporting bearings change during normal ship operation, mainly due to the different ship loading conditions and propeller immersion states, which in turn affect the load that each bearing supports. A proper shaft alignment plan is thus required, which determines appropriate bearing vertical offsets that lead to an equal distribution of loads among the supporting bearings. In operation, shaft alignment may be considerably influenced by hull deflections, due to different loading and environmental conditions. The robustness of the shaft alignment at different loading conditions of the ship should be carefully assessed [5], to avoid conditions where the bearings are either overloaded (which leads to operation with very small minimum film thickness, increased wear and inability to support sudden impact loads) or very lightly loaded, characterised by very limited vertical stiffness and thus prone to extreme lateral vibration levels and oil whirling (several case studies regarding whirling vibration problems have been presented in [6]). The current effective regulations for elastic shaft alignment in ships [7,8] demonstrate upper and lower bearing load limits for safe and reliable operation of the ship propulsion shaft.
Several studies have been conducted in the past focusing on defect identification in journal bearings utilising vibration or sound signals. Ma and Zhang have investigated in [9] the excitation mechanisms and contributions of tribofilm-asperity interaction that occur in the hydrodynamic lubrication regime of journal bearings. They also used the spatial power spectral density as a feature of the non-Gaussian roughness surfaces for early wear to analyse the microscopic pressure fluctuations, aiming to provide a new understanding for characterising noisy vibration signals for early wear monitoring of journal bearings. They extended this work in [10], focusing on the diagnosis of abrasive wear, to find out that wear-induced narrowband spatial components of the journal surface can excite random vibration of the bearing. The speed-dependent vibrational behaviour is found to be an effective indicator of surface defects. Additionally, vibration signals have also been used for bearing wear state detection by Wang et al. [11] and oil analysis for wear debris detection by Appleby [12]. Šaravanja and Grbešić in [13] highlight that the most important step in the vibrational diagnostics of journal bearings is the choice of measuring points, as well as the choice and mounting of sensors, most of which depend on the accuracy of the test and the results obtained. Lastly, according to Poddar in [14], vibration and acoustic emission are proven techniques in fault diagnosis of ball bearings and gears, but their applications to journal bearings have not been fully explored. Despite extensive research work being done on its design aspects, there is a dearth of studies on condition monitoring and fault diagnosis of journal bearings through vibration and acoustic emission. This work, on the other hand, aims at the development of journal bearing performance identification tools utilising ML techniques.
Python is one of the programming languages that has an ML module with most of the commonly used algorithms and gets regularly updated to meet new and more complex needs [15]. Due to the low computational cost and high computational speed, Python appears to be a simple and trustworthy choice. In tribology, ML algorithms such as Decision Trees and Support Vector Machines have been used in the literature initially for fault diagnosis purposes, especially in roller bearings, focusing on the feature selection methodology [16] or following a statistical feature selection [17] to extract critical information primarily from sound sensor signals [18] to assess the roller bearing operational state or fault. ML has also been used as a tool for journal bearing fault identification and more particularly by Salunkhe and Desavale in [19] as an intelligent method for the detection of bearing vibration characteristics. Rauber et al. proposed the utilisation of ML as a method for fault diagnosis based on vibration signals [20] and Umbrajkaar et al. have extended the utilisation of ML towards the identification of shaft-related performance parameters, conducting vibration analysis of the shaft misalignment under variable load conditions [21]. In this work, several ML algorithms will be tested utilising experimental data and features extracted utilising the octave band analysis for the prediction of several performance parameters of the bearing.

PROBLEM DESCRIPTION - METHODOLOGY
The present work is concerned with the development of ML algorithms to predict the real-time steady-state performance indices of journal bearings (load, minimum film thickness) over a wide range of bearing load and rotational speed values, utilising sound and vibration measurements. First, a set of experiments has been set up and conducted. In particular, journal bearings of the same principal dimensions have been prepared and tested experimentally for different combinations of bearing load and journal rotational speed. For each experiment, sound and vibration measurements in the vicinity of the journal bearing structure have been additionally performed, and the corresponding signals have been post-processed and stored. All experiments have been conducted on the Bently Nevada Rotor Kit 4 of the Laboratory of Marine Engineering, NTUA, which is equipped with a data acquisition system (DAQ), controlled by a LabVIEW application for processing and storing all measurement data.
Next, a one-third octave analysis has been performed for recorded sound and vibration signals for different segments of the signal length; a source code written in Python performs the required calculations along with all the necessary adjustments. The frequency band domains produced from this analysis are used to generate the feature space of the ML algorithms. The importance of each feature varies depending on the problem at hand. The algorithms' results depend greatly on the information given through the features, and the user should pay attention to the selection process.
Based on the studied cases, we have further investigated the signals and features which enable determination of the load status of a bearing. The main goals of the study are summarised below:
• Evaluate ML performance for (a) sound signals, (b) vibration signals, and (c) a combination of the above.
• Investigate whether training data from one bearing can help predict the loading condition of a similar bearing and what features enable this function.
• Determine whether combining training data from multiple bearings (of the same type/principal dimensions) can produce an algorithm that will enable accurate predictions for the entire bearing family.

MACHINE LEARNING ALGORITHMS
Over the past years, ML algorithms have attracted a lot of attention and have been extensively used for classification problems, aiming to recognise or predict different classes within a dataset. Selecting a programming language for ML and data science depends on the project or experience from previous projects. Python is an object-oriented programming language [22], widely used in ML and scientific applications, having libraries such as Scikit-learn and SciPy for ML and data analysis. The Scikit-learn ML library [23] is well documented with many examples and tutorials, features various classification, regression and clustering algorithms including support vector machines, random forests and gradient boosting, and cooperates very well with Python's numerical and scientific libraries NumPy and SciPy, as well as with the rest of the available open-source libraries. The common classification algorithms that we also utilised in this work are presented here.

k-Nearest Neighbours
k-Nearest Neighbours algorithms used for classification are simple and only require the storage of the training dataset. They do not build an internal model to aid with the prediction; instead, they create a space with as many dimensions as the number of the dataset's features, which is then populated with the training data. Given a new data point, the algorithm searches for the k closest training points in this multi-dimensional space. The value of k, which is set manually, is the number of closest points that participate in the majority vote, and the class assigned to the new data point is the one with the most representatives among its k nearest neighbours.
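The majority vote described above can be illustrated with a minimal sketch. It is written in plain Python so that it runs without third-party packages; the two-feature samples, the load labels and the function name `knn_predict` are hypothetical and are not taken from the experiments of this work, which rely on Scikit-learn.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # Euclidean distance from x to every stored training point
    dists = [(math.dist(p, x), label) for p, label in zip(train_X, train_y)]
    dists.sort(key=lambda pair: pair[0])
    # Count the class labels of the k closest points
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-feature samples, labelled by bearing load in N
train_X = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.15),
           (0.90, 0.80), (0.80, 0.90), (0.85, 0.85)]
train_y = [2, 2, 2, 14, 14, 14]

prediction = knn_predict(train_X, train_y, (0.2, 0.2), k=3)
```

Note that no training step appears in the sketch: the stored dataset itself acts as the model, which is why only k needs to be tuned.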

Decision Trees
Decision Trees are models that create a sequence of if/else questions that ultimately leads to a prediction of the value of the target variable. Each question splits the data into smaller groups. The aim of each question is to split the data in the most efficient way in order to make a quick and accurate prediction. The number of questions asked is chosen by the user and is in principle one of the basic termination criteria of the algorithm [24,25]. Each newly formed group is called a node (decision node) and the size of the node could serve as a termination criterion (prediction node). The training process is performed including all data features.
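As a minimal sketch of how a single if/else question can be chosen, the snippet below scans the candidate thresholds of one feature and keeps the one with the lowest weighted Gini impurity. The Gini criterion is an assumption made for illustration (it is a common choice, but the text does not state which criterion was used), and the band levels and load classes are hypothetical values.

```python
def gini(labels):
    """Gini impurity of a group of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best 'value <= threshold?' question."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # identical values cannot be separated by a threshold
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [lab for v, lab in pairs if v <= threshold]
        right = [lab for v, lab in pairs if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best

band_level = [1.0, 1.2, 1.1, 3.0, 3.2, 2.9]   # hypothetical band power levels
load_class = [2, 2, 2, 14, 14, 14]            # hypothetical load labels in N
threshold, impurity = best_split(band_level, load_class)
```

A full tree repeats this search inside each resulting group until a termination criterion, such as a maximum depth or a minimum node size, is met.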

Random Forest
Random Forests are a way to solve some inherent drawbacks of Decision Trees. A random forest is a set of many random trees that are differentiated from each other in terms of the data points used to build the tree and the features used for each split. The algorithm starts ($b = 1$) by drawing a sample from the whole training dataset and creating a tree ($T_b$) according to a set of features drawn from the available feature space.
Once the minimum node size $n_{min}$ is reached, the algorithm creates the next random tree until $b = B$, where $B$ is the number of estimators (random forest trees) defined by the user. The end of the training results in an ensemble of trees $\{T_b\}_1^B$. After the random forest trees are created, the algorithm makes a prediction for each tree. If the model solves a regression problem, the algorithm averages the prediction branch results to produce a prediction for the new data point x; if the model is conducting classification, then the algorithm creates a voting strategy where every tree provides a probability for each class and then all the probabilities are averaged to target the most probable class.
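The two ingredients described above, drawing a random sample for each tree and averaging the per-tree class probabilities, can be sketched as follows. The tree outputs are hypothetical numbers supplied by hand, since building the trees themselves is outside the scope of this illustration; the function names are likewise illustrative.

```python
import random

def bootstrap_sample(n, rng):
    """Indices of a sample of size n drawn with replacement from n training points."""
    return [rng.randrange(n) for _ in range(n)]

def forest_predict_proba(tree_probas):
    """Average the class probabilities returned by the individual trees."""
    classes = tree_probas[0].keys()
    n_trees = len(tree_probas)
    return {c: sum(p[c] for p in tree_probas) / n_trees for c in classes}

def forest_predict(tree_probas):
    """Class with the highest averaged probability."""
    averaged = forest_predict_proba(tree_probas)
    return max(averaged, key=averaged.get)

# One bootstrap sample per tree would be drawn like this (B trees in total)
sample = bootstrap_sample(10, random.Random(0))

# Hypothetical outputs of B = 3 trees for one new point x (classes are loads in N)
tree_probas = [
    {2: 0.7, 8: 0.2, 14: 0.1},
    {2: 0.4, 8: 0.5, 14: 0.1},
    {2: 0.6, 8: 0.3, 14: 0.1},
]
averaged = forest_predict_proba(tree_probas)
label = forest_predict(tree_probas)
```

In this example the second tree alone would vote for the 8 N class, but the averaged probabilities still favour 2 N, which is how the ensemble damps the errors of individual trees.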

Gradient Tree Boosting
In Gradient Tree Boosting methods, the algorithm generates trees in a "serial" way and each new tree attempts to correct the mistakes of the previous one. The user defines the tree size and aims to initialise with shallow trees, which is called pre-pruning. These shallow trees are called weak learners and their depth usually varies between two and five. Each weak learner has a small effect on the algorithm's prediction. As more trees (m) are added, the performance of the algorithm improves until the maximum number of trees is reached (M), or until the prediction accuracy is not improving any further after several iterations.
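The serial correction scheme can be sketched for a regression problem with squared error, where each new weak learner is fitted to the residuals of the current ensemble. For brevity, the weak learners here are depth-one stumps on a single feature, which is a simplification of the depth range quoted above; the learning rate of 0.1, the data and all function names are illustrative assumptions.

```python
def fit_stump(x, residuals):
    """Least-squares regression stump (one split) fitted to the residuals."""
    best = None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda xi: lm if xi <= t else rm

def gradient_boost(x, y, M=200, learning_rate=0.1):
    """Add M stumps one after another; return the model and the training MSE history."""
    base = sum(y) / len(y)            # initial prediction: mean of the targets
    stumps, history = [], []
    pred = [base] * len(y)
    for _ in range(M):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)   # new tree corrects the current mistakes
        stumps.append(stump)
        pred = [pi + learning_rate * stump(xi) for xi, pi in zip(x, pred)]
        history.append(sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) / len(y))
    model = lambda xi: base + learning_rate * sum(s(xi) for s in stumps)
    return model, history

x = [1, 2, 3, 4, 5, 6]            # hypothetical feature values
y = [2, 2, 8, 8, 14, 14]          # hypothetical targets (load in N)
model, history = gradient_boost(x, y)
```

Because each stump is scaled by the learning rate, a single tree has only a small effect and the training error in `history` decreases gradually as trees are added.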

EXPERIMENTAL SETUP
The Bently Nevada Rotor Kit Model RK4 used in the experimental part of the present work consists of a Bently Nevada electric motor, coupled by means of a flexible coupling to a 10 mm steel shaft. The motor supports rotational speeds of up to 10,000 rpm, controlled by a Bently Nevada RK-4 Speed Control unit, which has a digital display to indicate the speed. The operator can monitor the current rotational speed of the device or set the desired operating speed. The controller measures the shaft rpm with the help of a proximity probe mounted on a suitably configured gear wheel. The shaft features a 24.5 mm diameter at the free-end for a length of 25.4 mm. The free-end part of the shaft is supported by a radial bearing; lubricating oil is supplied to the bearing by means of a Bently Nevada RK4 oil pump. The shaft eccentricity and attitude angle are measured by means of two perpendicularly mounted proximity probes. The shaft is additionally supported at the motor end by a simple dry radial bearing. Additional weights are attached to the shaft, which can be used to modify the bearing loading. Specifically, there are two cylindrical masses 75 mm in diameter and 25 mm in length, weighing 0.800 kg each [26,27]. The bearing load can be adjusted by appropriate axial translation of the cylindrical masses. In the present experimental work, bearing loads of 2.0, 8.0 and 14.0 N have been considered. The motor and the bearings are mounted on a long, rigid steel base. The main dimensions of the shaft are presented in Table 1 and the Bently Nevada RK-4 experimental setup designed for the current experiment is presented in Fig. 1.
In the experimental part of this work, two almost identical journal bearings have been used, in particular a Plexiglas (Poly methyl methacrylate, PMMA) and an Acetal (Polyoxymethylene, POM) bearing. The first bearing is a ServoFluid Control Bearing designed, manufactured and assembled by Bently Nevada. The second bearing used is manufactured according to the design plans of Bently Nevada [26]. The inner diameter of the bearings has been measured with a three-point internal micrometer. The oil resistance properties of both bearings were tested, confirming that the dimensions and properties of the bearing would not change throughout the experimental procedure. The nominal geometric dimensions of both bearings (ServoFluid Control Bearing and Custom Acetal Bearing) are presented in Table 1. The Bently Nevada RK-4 ServoFluid Control Bearing components are presented in Fig. 2.
Bearing acceleration signals were measured using an ICP® Model 356A02 triaxial accelerometer with a hexagonal base. Its frequency range (±10%) spans between 0.5 and 6000 Hz and has a measurement range of ±500 g pk [28]. The hexagonal base of the accelerometer is mounted on the surface with the instant adhesive Loctite 454, and the accelerometer is then secured to the base. Also, an ICP® 130D21 Array Microphone, a prepolarised condenser microphone coupled with an ICP® sensor powered preamp, is utilised for sound pressure measurements. The frequency response of the sensor (-2 to 5 dB) is 20 to 15000 Hz [29].
In order to improve the accuracy of the sound pressure results obtained, a soundproof cover was designed and mounted on the top part of the bearing. This cover works beneficially in two ways: (a) it insulates the sound waves produced by the bearing assembly from external sound sources and (b) it absorbs the reflection waves from the assembly to reduce noise in the microphone. The sensor installation and setup for the current experiment are illustrated in Fig. 3.

SIGNAL COLLECTION - DATA PROCESSING
Before performing the experiments, it is necessary to properly prepare the experimental setup of the RK4 Speed Control Unit and the computer that will receive the results of the measurements of each experiment utilising NI LabVIEW. The accelerometer and the microphone are connected to a Model 482A22 ICP® Sensor Signal Conditioner. The conditioner is connected to the IoTech DaqBook 2000 [30], which gathers signals from different signal conditioners and simultaneously sends them to the data acquisition card. The data acquisition card used is the IoTech DaqBoard 2001, which constitutes the analog signal input of the computer. The software installed on the computer is NI LabVIEW 2017 [31,32]. The data acquisition is performed in single-ended mode, i.e., a circuit setup in which the voltage is measured between one signal line and the common ground voltage (Vcm).
The measurements are acquired from the microphone and the accelerometer with a sampling rate of 1000 samples/sec and the mean duration of each experiment is 30 seconds. These acquisition parameters are set in LabVIEW software, as illustrated in Fig. 4. The rotational speed of the shaft is different for each experiment and thus a one-minute window between the experiments is necessary. In this way, the effect of the transitional phenomena on the measurements is significantly reduced. The file produced after each experiment is a comma-separated values file (.csv) with every line containing an instance with five values: "sound pressure", "acceleration x", "acceleration y", "acceleration z" in mV and "rotational speed" in rpm. The measured values, given in mV, are calibrated in actual units (pressure, acceleration) using the following sensor sensitivities as calibration factors: Sound_Press = 33.8 mV/Pa, X_acc = 1.002 mV/(m/s²), Y_acc = 0.990 mV/(m/s²), Z_acc = 0.979 mV/(m/s²).
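A minimal sketch of the calibration step is given below. Since the factors are quoted in mV per physical unit, the sketch assumes on dimensional grounds that each mV reading is divided by the corresponding factor. The column order follows the five values listed in the text, while the column names, the helper `calibrate_row` and the sample line are hypothetical.

```python
import csv
import io

# Calibration factors quoted in the text, in mV per physical unit
SENSITIVITY = {
    "sound_pressure": 33.8,   # mV/Pa
    "acc_x": 1.002,           # mV/(m/s^2)
    "acc_y": 0.990,           # mV/(m/s^2)
    "acc_z": 0.979,           # mV/(m/s^2)
}
# Assumed column names, in the order given in the text
COLUMNS = ["sound_pressure", "acc_x", "acc_y", "acc_z", "rpm"]

def calibrate_row(row):
    """Convert one CSV line from mV to Pa and m/s^2; rpm is passed through."""
    values = dict(zip(COLUMNS, map(float, row)))
    out = {name: values[name] / factor for name, factor in SENSITIVITY.items()}
    out["rpm"] = values["rpm"]
    return out

# Hypothetical line of a recording file
sample_csv = "33.8,2.004,0.0,-0.979,1800\n"
rows = [calibrate_row(r) for r in csv.reader(io.StringIO(sample_csv))]
```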
In the present experimental work, the rotational speed is varied in the range from 500-4600 rpm, getting the specific discrete classes of rotational speeds: 500 / 1000 / 1800 / 2500 / 3300 / 4000 / 4600 rpm. A small variation of the speed during each experiment of ± 5 rpm is considered acceptable.

Octave band analysis
In the present work, the one-third octave band analysis is used to filter the acceleration and sound pressure signals. This type of analysis is chosen for two main reasons: 1. The frequency domain reveals frequency components and their individual amplitudes, 2. It can be combined with ML for feature extraction (in comparison to an FFT analysis).
Vibration signals of interest can extend between frequencies from 0.1 Hz to around 70 Hz, whereas noise signals can reach very high frequencies depending on the application (e.g., aircraft generate high frequency noise) [33]. In general, vibrational signals hold high energy at the lower spectrum range, while energy is substantially lower at high frequencies.
Here a low pass filter has been used which cuts off frequencies above 70 Hz. It should be noted here that, in actual working environments, high frequency vibrations are less useful, since they are influenced by the operation of other neighbouring machinery.
Out of the many types of frequency bands, the octave or one-third octave bands are the most frequently used for frequency analysis. In the present study, vibration and sound signals are sampled and processed utilising the octave band analysis to extract features. These independent features are used as input for the several ML algorithms tested, aiming to determine the performance parameters of the journal bearing in a series of case studies. The proposed algorithm should be agnostic of the current operation of the bearing. Thus, no fundamental frequencies were considered, which would impose a bias in the algorithms. Each band's power level, represented by its centre frequency, is a feature: the bands covering 0-70 Hz for vibration signals are named features #1-#19, and the bands covering 0-400 Hz for sound signals are named features #20-#31, for the training process of the ML algorithms. In Figs. 5 & 6 examples of the one-third octave analysis for a sound and acceleration Z signal respectively are illustrated.
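The band layout can be sketched with the usual base-2 definition of one-third octave bands, in which centre frequencies are spaced by a factor of 2^(1/3) around a 1 kHz reference and band edges lie a factor of 2^(1/6) from the centre. The text does not list the centre frequencies used; the lower limits of 0.9 Hz and 30 Hz below are assumptions, chosen only because they reproduce the stated counts of 19 vibration and 12 sound features.

```python
def third_octave_bands(f_min, f_max):
    """Base-2 one-third octave bands whose centre frequency lies in [f_min, f_max].

    Returns a list of (lower_edge, centre, upper_edge) tuples in Hz.
    """
    half_width = 2 ** (1 / 6)      # band edges sit one-sixth octave from the centre
    bands = []
    for n in range(-40, 21):       # band index relative to the 1 kHz reference band
        centre = 1000.0 * 2 ** (n / 3)
        if f_min <= centre <= f_max:
            bands.append((centre / half_width, centre, centre * half_width))
    return bands

# Assumed limits: they yield 19 and 12 bands, matching features #1-#19 and #20-#31
vibration_bands = third_octave_bands(0.9, 70.0)
sound_bands = third_octave_bands(30.0, 400.0)
```

Under these assumptions, the vibration bands run from a centre frequency of about 0.98 Hz to 62.5 Hz and the sound bands from 31.25 Hz to about 397 Hz.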

MEASUREMENT PROCEDURE AND EXPERIMENTAL RESULTS
The measurement procedure can be briefly summarised as follows:
1. The bearing load is adjusted by appropriate axial translation of the cylindrical weights.
2. Bearing lubrication is started.
3. The bearing motor is started and the shaft is accelerated to reach the initial rotational speed (500 rpm). Rotation is maintained until the temperature is stabilised, following an exponential function, to reach asymptotically the constant value throughout the entire experiment.
For each different desired rotational speed:
1. The shaft is accelerated to the desired rotational speed. Rotation is maintained until the lubricant temperature is stabilised.
2. After steady-state conditions have been reached, data recording is initiated for a period of 30 seconds.
Several experiments are conducted for different combinations of bearing load and shaft rotational speed. In particular, the rotational speed ranges from 500-4600 rpm, with intervals in the following vector: {500, 1000, 1800, 3300, 4000, 4600} rpm. Three different bearing loads have been considered, namely 2, 8 and 14 N. Therefore, the total set of experiments consists of 6 × 3 = 18 different states.
The acceleration and sound signals are transformed into the frequency domain using the FFT, so that the 1/3 octave filter can be applied there. It is very important to ensure that the experiment duration is adequate, so that all the essential information can be extracted from the dataset by the 1/3 octave analysis. If the sampling duration is not long enough, then the low frequencies, which are essential especially in the vibration signal, will be filtered out.
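The link between segment duration and low-frequency content can be made concrete with a small sketch. A segment of duration T has a frequency resolution of 1/T, so a narrow low-frequency band may contain no spectral line at all when T is short. The code below uses a direct DFT in plain Python instead of an FFT routine so that it runs without third-party packages; the base-2 band edges, the test tone and the function names are illustrative assumptions.

```python
import cmath
import math

FS = 1000  # sampling rate in samples/s, as stated in the text

def band_edges(centre):
    """Lower and upper edge of a base-2 one-third octave band."""
    return centre / 2 ** (1 / 6), centre * 2 ** (1 / 6)

def bin_range(f_lo, f_hi, n, fs=FS):
    """DFT bin indices k whose frequency k*fs/n falls inside [f_lo, f_hi]."""
    k_lo = max(1, math.ceil(f_lo * n / fs))
    k_hi = min(n // 2 - 1, math.floor(f_hi * n / fs))
    return range(k_lo, k_hi + 1)

def bin_power(signal, k):
    """One-sided power of DFT bin k of a real signal (mean-square units)."""
    n = len(signal)
    coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(signal)) / n
    return 2.0 * abs(coeff) ** 2   # factor 2 accounts for the negative-frequency twin

def band_power(signal, centre, fs=FS):
    """Sum of the bin powers inside the one-third octave band around `centre`."""
    f_lo, f_hi = band_edges(centre)
    return sum(bin_power(signal, k) for k in bin_range(f_lo, f_hi, len(signal), fs))

# Hypothetical 1 s test tone: unit-amplitude sine at 25 Hz
signal = [math.sin(2 * math.pi * 25 * i / FS) for i in range(FS)]
p25 = band_power(signal, 25.0)   # band containing the tone
p50 = band_power(signal, 50.0)   # neighbouring octave, should be empty

# Number of spectral lines inside the 1.25 Hz band for two segment durations
bins_short = len(bin_range(*band_edges(1.25), n=1 * FS))    # 1 s segment
bins_long = len(bin_range(*band_edges(1.25), n=30 * FS))    # 30 s segment
```

With a 1 s segment the spectral lines lie 1 Hz apart and none falls inside the 1.25 Hz band, whereas the 30 s segment resolves several lines there, which illustrates why an adequate recording duration is needed for the lowest vibration bands.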
Features extracted from the octave band analysis were evaluated based on their importance and their positive effect on ML prediction accuracy. Essentially, measurements for bearing operation in the large range of 500-4600 rpm are affected differently by central natural frequencies (features), which in turn affect the ML algorithm prediction accuracy. Additionally, for a holistic study, it is not possible to focus on different frequency features for each speed value; the selected features should be optimal for the entire range of rpm and frequency values. Thus, the limit was set to 70 Hz, in part aiming to highlight that the ML prediction accuracy can be very high (more than 90%, see e.g., Tables 3 & 4), although the natural frequency of some shaft rotational speeds within the range 500-4600 rpm is not included as a "special", more important feature.
Finally, the test data are divided into two categories. In the first category, the 80-20 rule for training and test data is selected and cross-validation is implemented, followed by evaluation of the algorithm's performance. The second category includes an additional set of test data, coming from a completely different experimental dataset, e.g., combinations of rotational speed and load not included in the initial predefined values. The obtained data are used for training and testing of the ML algorithms according to the detailed needs of each case study.
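The first category can be sketched as an 80-20 random split followed by k-fold cross-validation on the training part. The sample count of 540 (18 states with 30 samples each), the seed, the choice of ten folds and the function names are illustrative assumptions; in practice the equivalent Scikit-learn utilities would normally be used.

```python
import random

def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle the sample indices and split them into training and test parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]

def k_fold_indices(indices, k=10):
    """Yield (train, validation) index lists; each fold validates exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]

# Hypothetical dataset size: 18 states x 30 samples
train_idx, test_idx = train_test_split_indices(540)
folds = list(k_fold_indices(train_idx, k=10))
```

The held-out 20% is not touched during cross-validation, so it remains available for the final evaluation of the tuned algorithm.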

CASE STUDY 1 - SIGNAL SELECTION
In order to determine which of the four signals (x, y, z acceleration and sound) performs optimally for the task at hand, a simple test is conducted, using one of these signals as input each time. The target label represents the mean load of the bearing. After the hyper-parameter tuning, the algorithms are executed and the results are evaluated. In this case study, the entire data set for a single bearing is used, except for the states of (1800 rpm, 14 N) and (4600 rpm, 2 N). The features utilised in the case studies are numbered as features #1-#19 to represent the power level of each one-third octave band, characterised by the respective centre frequency values of the vibration signal (0-70 Hz) and as features #20-#31 to represent the power level of each band, characterised by the respective centre frequency values of the sound signal (0-400 Hz). All features are normalised automatically before they are utilised as input in the ML algorithms tested. The score achieved for each classification and regression algorithm is presented in Table 2. The sound signal produces better results without overfitting of the data, the acceleration z signal is the second best, and acceleration x and y are the least efficient features for accurate predictions.
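The text does not state which normalisation is applied. As one common option, the sketch below standardises each feature to zero mean and unit variance, with the statistics estimated on the training rows only and then reused for the test rows. The feature values and function names are hypothetical.

```python
import statistics

def fit_scaler(train_rows):
    """Per-feature mean and standard deviation, estimated on training rows only."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(c) for c in columns]
    # A constant feature has zero spread; fall back to 1.0 to avoid dividing by zero
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return means, stds

def transform(rows, means, stds):
    """Apply the training statistics to any set of rows."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

# Hypothetical band-level features (two features, three training samples)
train_rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test_rows = [[2.0, 40.0]]

means, stds = fit_scaler(train_rows)
train_scaled = transform(train_rows, means, stds)
test_scaled = transform(test_rows, means, stds)
```

Scaling matters most for distance-based algorithms such as KNN, where a feature with a large numerical range would otherwise dominate the distance.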
The best performing algorithms applied are the Random Forest Classifier (RFC), the k-Nearest Neighbours Classifier (KNN) and the Gradient Boosting Regressor (GBR). It should be noted that the acceleration x and y results are unstable during cross-validation, with very little positive effect on the overall prediction accuracy of the algorithms. Based on these findings, only the sound and acceleration z signals are utilised as input for training in the following case studies. In some clearly distinct cases, acceleration z signals achieve good standalone results, but in most cases, a combination of sound and acceleration z signals will be used to achieve the optimum results.

CASE STUDY 2 - RPM PREDICTION FOR A GIVEN LOAD (8 N)
This case study aims to determine the rotational speed of the shaft by using the sound signal produced by the bearing. The training samples have one second duration and 30 samples are used for each rotational speed. The test data are randomly chosen from the training data pool and constitute 10% of the total test data volume. The actual rpm value and the algorithm's prediction are then presented in order to evaluate its accuracy. Training and test data are selected only in the intermediate range of 8 N load for every rpm scenario. Also, features for sound signal space are used, numbered #20-#31 representing one frequency band each. Fig. 8 is a radar chart that illustrates the importance, in [%], of features #20-#31 used in this study. Features #20-#31 represent the acoustic level of each band, characterised by the respective centre frequency values of the sound signal (0-400 Hz). In the radial direction, the values of the concentric grey circles are 2%-37%, demonstrating the different effect that each feature may have on the accurate prediction of the shaft rotational speed in the selected rpm cases that were tested. Reviewing Fig. 8, one can observe that the frequency signature of each rotational speed differs, so the results of this case study are as expected. Fig. 9 is a pie chart showing the importance of each feature in the decision-making processes of the RFC and GBR algorithms respectively. This figure also demonstrates which frequency bands (features) are the most important for accurate ML predictions and thus can be an efficient tool for engineers who need to select which frequency signals to sample depending on the case. The algorithms used for feature importance selection are the RFC and the GBR. For both algorithms, a maximum tree depth of 3 and a number of estimators (trees) equal to 50 are used in order to avoid overfitting.
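Cutting a recording into fixed-duration training samples can be sketched as below. The sketch assumes non-overlapping segments, which the text does not state explicitly; with the quoted sampling rate, a 30-second recording then yields exactly 30 one-second samples. The zero-valued recording is a placeholder and `segment` is an illustrative name.

```python
FS = 1000  # sampling rate in samples/s, as stated in the text

def segment(signal, seconds, fs=FS):
    """Split a recording into consecutive, non-overlapping segments."""
    size = int(seconds * fs)
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]

recording = [0.0] * (30 * FS)            # placeholder for a 30 s sound recording
one_second_samples = segment(recording, 1.0)
```

Each segment is then passed through the one-third octave analysis, so that one recording contributes 30 feature vectors for its rotational speed.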
The classification results presented in Table 3 are very accurate, as expected from manual observation of the feature values. The regression results presented in Table 4 show a larger variance in the 500 rpm cases. Furthermore, the 4600 rpm value is predicted three times as 4572 rpm which, after data analysis, is attributed to the value similarity and the equally high importance of features #28 and #30. As a result, the rpm feature will be used as a predictor in some case studies, to take maximum advantage of the existing high precision equipment measuring the shaft rotational speed.

CASE STUDY 3 - LOAD PREDICTION FOR GIVEN RPM (ACCELERATION Z & SOUND SIGNAL)
In this case study, the goal is to determine the loading condition of the bearing using primarily (a) the acceleration z and secondarily (b) the sound signal. Thirty samples per load per rotational speed are used for training, each with a duration of 3 seconds. The 80%-20% rule is used to divide the training and test data. The algorithms tested are the RFC, the KNN and the GBR, utilising the entire acceleration z signal feature space, numbered as #1-#19. Fig. 10 shows the feature importance obtained with the RFC algorithm. Features #11, #12, #14 and #18 and the respective frequency bands are selected as the most important ones according to RFC. Furthermore, Fig. 11 shows a 3D visualisation of the multi-dimensional samples, where the three different loading cases are distinguished by colour. In Fig. 11 only three selected feature dimensions out of the total 20-dimensional mapping produced by the KNN are demonstrated, namely #12, #14 and rpm. The "k" value of nearest neighbours significantly affects the results of the algorithm and should be tuned during the training process; in this case study k = 4.
The results produced by each algorithm are shown in Tables 5 & 6. KNN has a class prediction accuracy ranging from 93-97% and RFC from 77-83%. GBR is not presented due to the low accuracy and high variance of the results. The prediction accuracy is improved by excluding the least important features from the training and testing process. An extensive feature selection process will be further examined in the next steps of this case study.
The second part of this case study, (b), aims to determine the loading condition of the bearing by using the sound signals acquired in the experimental procedure. The data format is the same and the algorithms used are again RFC, KNN and GBR, utilising the entire sound signal feature space, numbered as #20-#31.
In Fig. 12 the feature importance chart results for each algorithm are presented. The RFC has assigned at least a small importance percentage to every feature, due to its algorithmic rule to divide the importance probability between features. RFC has given the highest percentage to features #25, #28 and #31, the same features that the GBR has selected as the most important. It should be noted that in the GBR feature importance chart only the three most important features are visible because the rest have a very low importance score and are therefore excluded.
Fig. 10. Feature importance, RFC, Case study 3a
Fig. 11. 3D KNN visualisation
The results of the feature importance can be explained by reviewing Fig. 13. In this figure, it is visible that the three loading conditions create three separate areas of operation in terms of features #25, #28 and #31. This makes it possible for the algorithm to accurately predict the mean load of the bearing and for the user to visualise the results and understand a very practical benefit arising from the octave band analysis applied in this technique. Note that the variable colour intensity marks the position of each point in the layer depth of the 3D plot. The confusion matrices of KNN and RFC for Case study 3b are shown in Table 7 and Table 8 respectively. The accuracy of both classifiers is higher than 97%, and overfitting is avoided through proper hyper-parameter tuning and the application of 10-fold cross validation. In addition to the classifier results, the Gradient Boosting Regressor (GBR) achieved 99.57% accuracy of predictions, demonstrating very promising results for the regression model as well.
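The 10-fold cross validation mentioned above can be illustrated with a short sketch. It assumes synthetic one-dimensional features and uses a 1-nearest-neighbour rule as a stand-in classifier; the function names are hypothetical, and the sketch only shows how each sample is held out exactly once, not the tuning procedure actually applied.

```python
import math
import random


def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Shuffle sample indices and deal them into n_folds disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def nearest_label(train, query):
    """Label of the single closest training sample (1-nearest-neighbour)."""
    return min(train, key=lambda s: math.dist(s[0], query))[1]


def cross_validate(data, n_folds=10):
    """Return one accuracy score per fold, each fold held out exactly once."""
    scores = []
    for fold in k_fold_indices(len(data), n_folds):
        held_out = set(fold)
        train = [s for i, s in enumerate(data) if i not in held_out]
        hits = sum(nearest_label(train, data[i][0]) == data[i][1] for i in fold)
        scores.append(hits / len(fold))
    return scores


# Synthetic stand-in data (assumption): one feature per sample, three loads.
rng = random.Random(2)
data = [((rng.gauss(centre, 0.3),), load)
        for load, centre in ((2, 0.0), (8, 4.0), (14, 8.0)) for _ in range(30)]

scores = cross_validate(data)
mean = sum(scores) / len(scores)
spread = max(scores) - min(scores)
print(f"mean accuracy {mean:.3f}, spread across folds {spread:.3f}")
```

A large spread between folds would suggest that the reported accuracy depends strongly on which samples happen to fall in the test set.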

CASE STUDY 4 - LOAD PREDICTION WITH SOUND SIGNAL, FOR UNTRAINED LOAD-RPM COMBINATIONS
In this case study, several sets of rpm-load combinations are excluded from the training process and used only for testing. The scope of these experiments is to evaluate how each algorithm handles new data and to compare the ability of several ML algorithms to accurately predict unknown data combinations when (a) interpolating within the initial dataset grid, (b) extrapolating outside the dataset grid, or (c) finding values at the border of the dataset grid. In cases III and V the rpm-load combinations are chosen to be the marginal combination values of the rotational speed and loading condition. The algorithms tested are the RFC, KNN and GBR. Other algorithms have also been tested but, overall, these three had the highest accuracy. All sound signal features are used in every case. Each case is named after the combination of "missing target data" that is not used for the training process. Most algorithms achieve very good accuracy in predictions of intermediary combinations of rpm-load, but the accuracy decreases dramatically when reaching the edge of the dataset grid. This implies that a fine mapping of the bearing's operation is needed to extract a trustworthy prediction model. Unstable predictions could indicate that the input belongs to an unmapped part of the bearing's operational states or may correlate to a specific failure mode.
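The exclusion of whole rpm-load combinations from training can be sketched as below. The grid values, record layout and function names are assumptions made for illustration; the actual rpm values and held-out cases of the experiments may differ.

```python
def is_border(rpm, load, rpm_grid, load_grid):
    """True when the combination lies on the edge of the rpm-load grid."""
    return (rpm in (min(rpm_grid), max(rpm_grid))
            or load in (min(load_grid), max(load_grid)))


def hold_out_combinations(records, held_out):
    """Split records so that held-out (rpm, load) pairs never reach training."""
    train = [r for r in records if (r["rpm"], r["load"]) not in held_out]
    test = [r for r in records if (r["rpm"], r["load"]) in held_out]
    return train, test


# Hypothetical grid (assumption): 5 speeds x 3 loads, 30 samples per cell.
rpm_grid = [1800, 2500, 3200, 4000, 4600]
load_grid = [2, 8, 14]
records = [{"rpm": rpm, "load": load, "sample": i}
           for rpm in rpm_grid for load in load_grid for i in range(30)]

# One interior and one corner combination are removed from training.
held_out = {(3200, 8), (4600, 14)}
train, test = hold_out_combinations(records, held_out)

for rpm, load in sorted(held_out):
    kind = "border" if is_border(rpm, load, rpm_grid, load_grid) else "interior"
    print(rpm, load, kind)
```

Splitting by combination rather than by individual sample ensures that the test set measures prediction at operating points the model has never seen, which is the situation examined in this case study.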

CASE STUDY 5 - LOAD PREDICTION WITH SOUND SIGNAL, FOR TRAINING WITH THE ACETAL BEARING AND TESTING ON THE PLEXIGLASS BEARING (2500 & 4000 RPM)
Summarising the findings of Case studies 1-4, the application of ML algorithms to make predictions regarding the operational state of the bearing utilising the sound signal features extracted from the octave-band filter has been quite promising. In the previous case studies, training and test data were both extracted from the experimental data of one specific journal bearing. In this case study, the training data will be extracted from the data measured on the Acetal bearing and the testing data will be extracted from the data acquired from the Plexiglass bearing. It should be noted that the previous case studies were tested for both bearings and the results were identical.
Initially the training and test data of 8 & 14 N load at 2500 rpm and the entire sound signal feature space (features #20-#31) as well as the rpm are used. The RFC and KNN (k = 5) algorithms achieved accuracy of 100%. On the other hand, the GBR algorithm produced inaccurate results and a notable shift in the automatically selected high importance features is observed, thus these results are rejected.
Aiming to find the minimum number of features necessary to predict the Plexiglass bearing's loading condition, a different approach is tested. The importance of the features selected is evaluated for both the Plexiglass and the Acetal bearing, revealing that different frequency band features are the most important in each case. To solve this instability, a dataset consisting of data from both bearings is created and the feature selection is repeated, aiming to find the most important features for the group of similar bearings. Then the developed models could be trained, taking into account these features from the Acetal bearing dataset and tested for predictions on the Plexiglass bearing dataset.
The rotational speed value tested is 4000 rpm; the 2, 8 and 14 N load cases are used for training, but only the 8 and 14 N cases for testing. The KNN algorithm was the most promising algorithm tested in this case study. Table 9 presents the prediction accuracy of the developed models, where one additional important feature is taken into account at each iteration and the model is rebuilt four times, until 100% accuracy is achieved.
In the last part of this case study, data at two rotational speed cases, namely 2500 & 4000 rpm, are combined in the feature selection process. Initially features #23, #20, #31, and #29 are used, achieving 81.82% accuracy with KNN, and then the next most important feature (#26) is added to achieve 100% prediction accuracy.
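The iterative procedure of adding the next most important feature and rebuilding the model can be sketched as follows. The toy data, the feature ranking and the function names are hypothetical, and a 1-nearest-neighbour rule stands in for the tuned KNN model; the data are constructed so that the first feature alone cannot separate the two load labels.

```python
import math


def accuracy_with(features, train, test):
    """1-nearest-neighbour accuracy using only the selected feature columns."""
    def project(x):
        return [x[i] for i in features]

    hits = 0
    for x, y in test:
        nearest = min(train, key=lambda s: math.dist(project(s[0]), project(x)))
        hits += nearest[1] == y
    return hits / len(test)


def add_until_target(ranked, train, test, target=1.0):
    """Add features in ranked order, rebuilding the model after each addition."""
    history, selected = [], []
    for feature in ranked:
        selected.append(feature)
        acc = accuracy_with(selected, train, test)
        history.append((list(selected), acc))
        if acc >= target:
            break
    return history


# Toy data (assumption): train on one bearing, test on another, labels 8/14 N.
train = [((0.0, 0.0), 8), ((0.0, 1.0), 14), ((1.0, 0.0), 14), ((1.0, 1.0), 8)]
test = [((0.1, 0.1), 8), ((0.1, 0.9), 14), ((0.9, 0.1), 14), ((0.9, 0.9), 8)]

history = add_until_target(ranked=[0, 1], train=train, test=test)
for selected, acc in history:
    print(selected, acc)
```

In this toy setting the accuracy is 0.5 with the first feature only and reaches 1.0 once the second feature is added, mirroring the stepwise improvement reported above.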

CASE STUDY 6 - TRAINING AND TESTING WITH ACETAL AND PLEXIGLASS BEARINGS, UTILISING SOUND SIGNAL
In this last case study, the goal is to examine the model accuracy utilising a mixed training and testing dataset from both the Acetal and the Plexiglass bearings. From the Acetal bearing operational state dataset, the cases of 2 N at 4600 rpm and 14 N at 1800 rpm are excluded to simulate missing data "holes" on the grid boundary. Additionally, all the loading scenarios for 8 N and 14 N load of the Plexiglass bearing are utilised. The entire sound signal feature space (features #20-#31) is included. Initially, the scope is to build a model able to identify the bearing class, namely Acetal or Plexiglass, and conduct feature selection. The most promising algorithms tested were the KNN and RFC, achieving 100% accuracy.
The last but most important study includes the load label as the target value, given sound frequency features from both bearing datasets. The most promising algorithms tested with converging accuracy were the RFC and the GBR, starting with a feature importance selection followed by an optimum model development for predictions. The feature importance results of both algorithms are presented in Fig. 14. The features that contain the most essential information for both bearings are features #25 and #31, followed by feature #22. The overall prediction accuracy for RFC reached 100% and for GBR 94.5%. This relatively high accuracy for both algorithms suggests that a dataset combining input from multiple similar bearings, manufactured according to the same design plan, can significantly improve the stability and accuracy of the produced model, ensuring a wider application range.

SUMMARY - CONCLUSIONS
In the present work, a Machine Learning procedure has been developed, aiming at predicting the loading condition of journal bearings, utilising real-time sound and vibration measurements. To this end, a set of experiments has been set up and conducted. Two journal bearings of the same principal dimensions, but slightly different final dimensions, within the design tolerances, have been prepared and tested experimentally for different combinations of bearing load and journal rotational speed. The experiments have been performed utilising the Bently Nevada Rotor Kit 4 of the Laboratory of Marine Engineering at NTUA. A series of measurements has been performed for different combinations of rotational speed, lying in the range of 500-4600 rpm, and bearing load, taking the values of 2, 8 and 14 N. A microphone and a triaxial accelerometer have been used to measure sound pressure and vibration signals generated during bearing operation. A one-third octave filter has been applied to post-process the obtained signals. The filtered signals have been segmented into shorter duration samples and have been fed to the ML algorithms.
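For illustration, the one-third octave post-processing can be sketched as below. The sketch assumes the common base-2 band definition, with centre frequencies $f_c = 1000 \cdot 2^{n/3}$ Hz and band edges $f_c \cdot 2^{\pm 1/6}$ (standards such as IEC 61260 use a base-10 ratio that gives nearly identical values). The frequency range, sampling rate, test tone and function names are assumptions, and a naive DFT replaces the filter bank actually used.

```python
import cmath
import math


def third_octave_bands(f_min, f_max, f_ref=1000.0):
    """(lower, centre, upper) of base-2 bands with centre in [f_min, f_max]."""
    n_lo = math.ceil(3 * math.log2(f_min / f_ref))
    n_hi = math.floor(3 * math.log2(f_max / f_ref))
    bands = []
    for n in range(n_lo, n_hi + 1):
        centre = f_ref * 2 ** (n / 3)
        bands.append((centre / 2 ** (1 / 6), centre, centre * 2 ** (1 / 6)))
    return bands


def band_powers(signal, fs, bands):
    """Sum the DFT power of `signal` inside each band (naive O(N^2) DFT)."""
    n = len(signal)
    powers = [0.0] * len(bands)
    for k in range(1, n // 2):
        freq = k * fs / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        for b, (lower, _, upper) in enumerate(bands):
            if lower <= freq < upper:
                powers[b] += abs(coeff) ** 2
    return powers


# Hypothetical test signal: a 100 Hz tone sampled at 1 kHz for 0.2 s.
fs = 1000
signal = [math.sin(2 * math.pi * 100 * t / fs) for t in range(200)]

bands = third_octave_bands(50.0, 400.0)
powers = band_powers(signal, fs, bands)
strongest = powers.index(max(powers))
print(f"strongest band centre: {bands[strongest][1]:.1f} Hz")
```

Each band power (or its level in dB) then plays the role of one feature per sample, so that a long recording cut into short segments yields one feature vector per segment.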
ML algorithms utilising the sound signal provided more accurate predictions, with prediction accuracy of the order of 98-100%; ML algorithms utilising the acceleration z signal were found to be second best, with prediction accuracy of the order of 85%. A variety of scenarios have been examined and the prediction accuracy of the algorithms has been shown to be adequate in most of the cases (more than 95%). The algorithms' performance has been shown to vary, depending on the test data; the algorithms perform better if the test data belong to an intermediary training rpm-load combination than if the test data belong to an extreme combination near the dataset boundaries. Additionally, it is possible to rank frequencies in terms of importance, using a feature selection process; however, this constitutes a case-sensitive result, and further attention is required in order to generalise the conclusions.
Tab. 9. Feature selection, 4000 rpm, Case study 5 (columns: Iteration, Features Added, Accuracy)
Regarding the signal selection process, several important conclusions can be drawn: (i) Sound measurement signals contain more information regarding the system state, in comparison to vibration measurement signals; (ii) 1/3 octave band signal analysis is shown to be very effective in extracting the significant information from the signal; (iii) It is possible to determine the bearing loading condition from vibration measurements alone; however, the prediction accuracy is somewhat lower than that corresponding to sound measurements; (iv) For the development of a general bearing model, a wide range mapping of the bearing operating spectrum is required; (v) The use of multiple similar bearings for ML algorithm training may significantly improve their accuracy and trustworthiness.
In the present paper, the prediction of bearing load, given the shaft rotational speed and utilising additional features extracted from vibration and sound signals (following the octave band analysis and ML), has been shown to be possible and constitutes a promising technique for journal bearing performance assessment. Especially for marine applications, where line and stern tube bearings cannot be surveyed in real time, an early sign of bearing overload is of particular importance, in order to avoid situations of fast wear growth and failure. It is emphasised that, today, for the vast majority of such bearings, monitoring of the operational state is normally performed by oil temperature measurements. However, oil temperature rises due to bearing malfunction are generally observed after the phenomenon has evolved, leading in many cases to extensive or even catastrophic bearing failures. Therefore, the proposed methodology can be extremely efficient for real-time non-intrusive bearing performance assessment.