Using Machine Learning Techniques to Predict Hospital Admission at the Emergency Department

Abstract Introduction One of the most important tasks in the Emergency Department (ED) is to promptly identify the patients who will benefit from hospital admission. Machine Learning (ML) techniques show promise as diagnostic aids in healthcare. Aim of the study Our objective was to find an algorithm using ML techniques to assist clinical decision-making in the emergency setting. Material and methods We assessed the following features seeking to investigate their performance in predicting hospital admission: serum levels of Urea, Creatinine, Lactate Dehydrogenase, Creatine Kinase, C-Reactive Protein, Complete Blood Count with differential, Activated Partial Thromboplastin Time, DDi-mer, International Normalized Ratio, age, gender, triage disposition to ED unit and ambulance utilization. A total of 3,204 ED visits were analyzed. Results The proposed algorithms generated models which demonstrated acceptable performance in predicting hospital admission of ED patients. The range of F-measure and ROC Area values of all eight evaluated algorithms were [0.679-0.708] and [0.734-0.774], respectively. The main advantages of this tool include easy access, availability, yes/no result, and low cost. The clinical implications of our approach might facilitate a shift from traditional clinical decision-making to a more sophisticated model. Conclusions Developing robust prognostic models with the utilization of common biomarkers is a project that might shape the future of emergency medicine. Our findings warrant confirmation with implementation in pragmatic ED trials.


Introduction
The Emergency Department (ED) represents a key element of any given healthcare facility and retains a high public profile. ED staff manage patients with a huge variety of medical problems and deal with all sorts of emergencies. ED congestion resulting in delays in care remains a frequent issue that prompts the development of tools for rapid triage of high-risk patients [1]. Moreover, it is well documented that timely interventions are critical for several acute diseases [2,3]. One of the most commonly encountered ED priorities is to quickly identify those who will need hospital admission. Traditionally, this decision relies on clinical judgment aided by the results of laboratory tests. Human factors leading to diagnostic errors occur frequently and are associated with increased morbidity and mortality [4].
Machine Learning (ML) techniques show promise as diagnostic aids in healthcare and have sparked the discussion for their wider application in the ED [5]. Developing robust prognostic models with the utilization of common biomarkers to facilitate rapid and reliable decision-making regarding hospital admission of ED patients is a project that might shape the future of emergency medicine. However, relevant data from the ED is scarce. Recent studies have focused on clinical outcome and mortality prediction [6,7].
We assessed biochemical markers and coagulation tests that are routinely checked in patients visit-ing the ED, seeking to investigate their performance in predicting whether the patients will be admitted to the hospital. Our aim is to find an algorithm using ML techniques to assist clinical decision-making in the emergency setting.

Materials and methods
This research is a retrospective observational study conducted in the ED of a public tertiary care hospital in Greece that has been approved by the Institutional Review Board of Sismanogleio General Hospital (Ref. No 15177/2020, 5969/2021).
All raw data was retrieved from a standard Hospital Information System (HIS) and a Laboratory Information System (LIS). The analysis was performed using the Waikato Environment for Knowledge Analysis (WEKA) [8], a Data Mining Software in Java workbench.
The flow diagram of the study is depicted in Figure  1. A total of 3,204 ED visits were analyzed during the study period  May 2019). The anonymous data set under investigation contains eighteen features presented in Table 1.
To assess the performance of the best-performing model (Smith and Frank 2016) for our analysis in WEKA, we have used a 10-fold cross-validation approach to avoid overfitting; Cross-validation is widely regarded as a quite reliable way to assess the quality of results from machine learning techniques. WEKA [9][10][11]  Among many algorithms that were evaluated for our research purposes, in this article, we present only the eight best-performing algorithms, mainly in terms of ROC area and F-Measure.
During our experiments, we retained the default settings of all classification algorithms' original implementations provided by WEKA. Each algorithm was evaluated on two data sets; the original data set, including the missing values, and on the data set where the missing values were identified, and they were replaced with appropriate values using WEKA's ReplaceMissing-Values filter. Furthermore, since the number of patients in our data set who met clinical criteria for hospital admission (36.7%) is less than those who did not meet (63.3%), we applied WEKA's ClassBalancer technique [8] to prevent overfitting by reweighting the instances in the data set so that each class had the same total weight during the phase of model training.

Results
The performance of each algorithm was evaluated on its ability to predict whether a patient seen in the emergency department is subsequently admitted to the hospital or not by only taking into consideration the features presented in Table 1. All algorithms were evaluated on both datasets (original with missing values, modified by using the ReplaceMissingValues filter), and the detailed results are presented in Appendix (Tables A1-A16). The classification performance's results on the original data set, regarding the F-Measure and ROC Area of each algorithm, are summarized in Table  2 and Figure 2.
According to Table 2, considering the weighted average values, it can be seen that Logit boost slightly outperformed other models with respect to both F-measure and ROC Area with values of 0.708 and 0.774, respec-tively. We can also observe that the range of F-measure and ROC Area values of all eight algorithms that were evaluated are [0.679-0.708] and [0.734-0.774], respectively, and they can be considered acceptable [21].
The classification performance's results of F-Measure and ROC Area on the data set where the missing values have been replaced by using WEKA's ReplaceMissing-Values filter are summarized in Table 3 and Figure 3.
According to Table 3, considering the weighted average values, it can be seen that the Random Forest slightly outperformed other models with respect to both F-measure and ROC Area with values of 0.723 and 0.789, respectively. Additionally, we can observe that the range of F-measure and ROC Area values of all eight algorithms are [0.663-0.723] and [0.731-0.789], respectively, and as previously noted, they can also be considered acceptable. Furthermore, we were positively surprised to see that the impact of missing values on the classifiers' performance was less pronounced than we initially thought.
Furthermore, since the admitted patients were 1175 versus 2029 that were not admitted, we applied the WEKA's ClassBalancer technique on both datasets and re-evaluate the performance of the two classifiers (Logit boost and Random Forest). After the application of the ClassBalancer filter in the original data set, we observe that the performance of Logit boost (F-measure:0.693; ROC area:0.773) (Table A17) is quite similar to this of the imbalanced data set (F-measure:0.708; ROC area:0.774). Similar behavior, we also observe in the performance of Random Forest before (F-measure:0.723; ROC area:0.789) and after (F-measure:0.704; ROC area:0.784) ( Table A18) the application of Class- Balancer filter in the data set where the missing values have been replaced by using the ReplaceMissingValues filter.

Discussion
Based on the data from 3,204 adult ED visits, using common laboratory tests and basic demographics, we evaluated eight ML algorithms that generated models that can reliably predict the hospital admission of patients seen in the ED. Our study utilized pre-existing patient data from a standard HIS and LIS. Therefore, the methods proposed here can serve as a valuable tool for the clinician to decide whether to admit or not an ED patient. The main advantages of this tool include easy access, availability, yes/no result, and low cost. The clinical implications of our approach might be signifi-cant and might facilitate a shift from traditional clinical decision-making to a more sophisticated model. The application of machine learning techniques in the ED is not entirely new. Yet, it is not considered the standard of care. Current efforts are aiming to develop and integrate clinical decision support systems able to provide objective criteria to healthcare professionals. Our study is consistent with previous research showing that logistic regression is the most frequently used technique for model design. The area under the receiver operating curve (AUC) is the most frequently used performance measure [22]. Moreover, the major goal of such predictive tools is to identify high-risk patients accurately and differentiate them from stable, low-risk patients that can be safely discharged from the ED [23] and communicate this identification to the medical expert who can take this information into account while making a decision on admission or discharge.
The hectic pace of work and the stressful setting of the ED have negative consequences on patient safety [24,25]. It is well established that human factors play an important role in the efficiency of healthcare systems. Different error types have different underlying mechanisms and require specific methods of risk management [26]. A fearful shortcoming for the emergency physician is to fail to admit a seriously ill patient. Our methods might be useful to reduce these errors while explicitly acknowledging that they are meant to aid and not substitute clinical judgment.
In summary, we present an inexpensive clinical decision support tool derived from readily available patient

Fig. 3. Weighted Average values of F-Measure and ROC Area for all methods -ReplaceMissingValues filters (10-fold cross-validation)
data. This tool is intended to aid the emergency physician regarding hospital admission decisions, as the development of machine learning models represents a rapidly evolving field in healthcare.

Limitations
This study is not without limitations. In our analysis, we did not include clinical parameters such as the vital signs and the Emergency Severity Index (ESI) [27]. We aimed to investigate whether our model can identify hospital admissions without taking into account clinical data. Thus, we included limited input variables in order to present a low-cost decision support tool with the minimum available data from our HIS. Τhere were also missing values in the data we collected and analyzed; for example, not all of the analyzed ED visits had all the laboratory investigations available. Furthermore, our preliminary findings have not yet been followed up by an implementation phase, and the proposed algorithms have not been validated in a pragmatic ED trial. Therefore, future research is warranted in order to demonstrate whether they can actually improve care.

Conclusions
In this study, we evaluated a collection of very popular ML classifiers on data from an ED. The proposed algorithms generated models which demonstrated acceptable performance in predicting hospital admission of ED patients based on common biochemical markers, coagulation tests, basic demographics, ambulance utilization, and triage disposition to the ED unit. Our research confirms the prevalent current notion that the utilization of artificial intelligence may have a favorable impact on the future of emergency medicine.