Predicting mortality among septic patients presenting to the emergency department–a cross sectional analysis using machine learning
BMC Emergency Medicine volume 21, Article number: 84 (2021)
Sepsis is a life-threatening condition, causing almost one fifth of all deaths worldwide. The aim of the current study was to identify variables predictive of 7- and 30-day mortality among variables reflective of the presentation of septic patients arriving to the emergency department (ED) using machine learning.
Retrospective cross-sectional design, including all patients arriving to the ED at Södersjukhuset in Sweden during 2013 and discharged with an International Classification of Diseases (ICD)-10 code corresponding to sepsis. All predictions were made using a Balanced Random Forest Classifier and 91 variables reflecting ED presentation. An exhaustive search was used to remove unnecessary variables in the final model. A 10-fold cross validation was performed and the accuracy was described using the mean value of the following: AUC, sensitivity, specificity, PPV, NPV, positive LR and negative LR.
The study population included 445 septic patients, randomised to a training set (n = 356, 80%) and a validation set (n = 89, 20%). The six most important variables for predicting 7-day mortality were: “fever”, “abnormal verbal response”, “low oxygen saturation”, “arrival by emergency medical services (EMS)”, “abnormal behaviour or level of consciousness” and “chills”. The model including these variables had an AUC of 0.83 (95% CI: 0.80–0.86). The final model predicting 30-day mortality used a similar set of six variables, with “breathing difficulties” replacing “abnormal behaviour or level of consciousness”. This model achieved an AUC of 0.80 (95% CI: 0.78–0.82).
The results suggest that six specific variables were predictive of 7- and 30-day mortality with good accuracy. These symptoms, observations and mode of arrival may therefore be important components to include, along with vital signs, in a future tool for predicting mortality among septic patients presenting to the ED. In addition, Random Forest appears to be a suitable machine learning method on which to build future studies.
Sepsis is defined as “life-threatening organ dysfunction caused by a dysregulated host response to infection” and has an annual incidence of approximately 840 per 100,000 in Sweden, and the incidence is increasing. Sepsis mortality is high, and sepsis caused almost one in five deaths worldwide in 2017. Improving our understanding of the importance of the clinical presentation of septic patients could enable early identification of patients likely to have a poor outcome. This is relevant since early treatment with antibiotics and fluid resuscitation has been shown to reduce mortality [6, 7].
The current sepsis criteria are based on the Sequential Organ Failure Assessment (SOFA) score, and were selected, at least in part, to include patients with an infection and a predicted poor outcome, measured as a mortality of approximately 10%. Several scoring tools have been suggested for predicting sepsis mortality in the emergency department (ED) setting [8, 9]. However, these scoring tools are based on vital signs and have limited accuracy [8, 9]. Vital signs alone are insufficient predictors, as one in five patients with severe infection have normal vital signs in the ED. This illustrates the need for an approach that goes beyond vital signs to identify septic patients at risk of poor outcome.
We therefore chose to include measures of the presentation of septic patients to the ED and to analyse them using machine learning. With increased computing power, together with the ability to self-learn and to handle big data, machine learning models have come to be used in the health care system with promising results. Notably, sepsis mortality has previously been predicted using Random Forest, a machine learning method. This suggests that machine learning methods can improve prediction accuracy.
Variables reflective of symptom presentation have been shown to be predictive of mortality among septic patients [13, 14]. Although the prior study using machine learning methods did include a large number of variables, none of them were reflective of symptoms at presentation. Therefore, the aim of the current study was to identify variables predictive of 7- and 30-day mortality among septic patients presenting to the ED based on the clinical presentation using machine learning.
Study design and setting
This was a retrospective cross-sectional study using previously identified variables  reflective of the presentation of septic patients arriving to the ED at Södersjukhuset. The hospital is located in Stockholm, and has more than 120,000 annual ED visits . The study period was between January 1st 2013 and December 31st 2013.
The inclusion criteria were patients ≥18 years of age, admitted to in-hospital care via the ED at Södersjukhuset and discharged from in-hospital care with an International Classification of Disease, Tenth Revision, (ICD-10) code corresponding to sepsis (A02.1, A22.7, A26.7, A32.7, A39.2, A39.4, A40.0 – A40.3, A48 - A49, A41.0 - A41.5, A41.8 - A41.9, A42.7, B37.7, R57.2, R65.0–65.1).
The exclusion criteria were: healthcare-associated infection (HCAI), defined as sepsis onset more than 48 h after arrival to the ED; arrival by emergency medical services (EMS) with ongoing treatment for sepsis or other infectious diseases; unknown mode of arrival; and the lack of personal identification number and medical record.
Definitions and predictive variables
Sepsis was defined as discharge from in-hospital care with an ICD-10 code corresponding to sepsis as specified above. Data was collected when the SEPSIS-2 criteria were in use . The study population included both EMS patients, arriving by ambulance or helicopter, and non-EMS patients, including all other means of arrival to the ED. The definition of severe sepsis was in accordance with a prior definition adapted for emergency care .
A total of 90 previously identified variables reflecting the clinical presentation of septic patients to the ED (i.e. vital signs, symptoms, observations and information from medical history, see Supplementary figure 1) were used, in addition to mode of arrival. Thus, a total of 91 variables were included and used as input for the machine learning methods, as described below.
Ethical approval and consent to participate
Ethical approval was obtained from the regional review board in Stockholm (“Regionala Etikprövningsnämnden i Stockholm”), diary number 2012/1288.31/3 and 2015/1019–32. All methods were carried out in accordance with relevant guidelines and regulations. Informed consent was waived by the regional review board in Stockholm as the current study was retrospective and based on a review of medical records.
IBM SPSS Statistics for Macintosh, version 26.0 (IBM Corp., Armonk, N.Y., USA) was used for the descriptive analysis, i.e. calculating mean, median, confidence interval and interquartile range for the characteristics of the study population. The Shapiro-Wilk test was used to test for normality.
Balanced random forests
The supervised machine learning models were developed using the Balanced Random Forest Classifier from the Imblearn (imbalanced-learn) collection. This method can be used to build prediction models and to identify associations between specific variables and the predicted outcome in imbalanced data. The Balanced Random Forest Classifier approaches the challenge of an imbalanced dataset by under-sampling the majority class (bootstrapping) and applying ensemble learning. Thus, the class distribution is changed so that the classes are represented equally in each tree; in this case “Patients who died within 7 or 30 days” and “Patients who survived”. A prior study has shown that under-sampling is a more effective method of balancing data than over-sampling, which explains the choice of under-sampling in the current study.
A 10-fold cross validation was implemented with an 80:20 train:test distribution. In each fold, the Balanced Random Forest Classifier included 100 decision trees. Each decision tree was created from a randomly selected subset of the fold’s training set through bootstrapping and included an equal number of patients who died and patients who survived. The ensemble learning method called feature bagging was implemented in the development of each tree: a random subset of variables, equal in size to the square root of the total number of variables, was tested at each node split. When the model makes a prediction, it is based on the majority vote of the 100 decision trees.
The fold’s test set was used to determine the accuracy, described as area under the ROC curve (AUC), sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (LR) and negative LR. The mean value across all 10 folds is presented in the results. The SHapley Additive exPlanations (SHAP) interpreter was used to illustrate the relationship between specific predictive variables and the outcome.
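The per-tree sampling, feature bagging and majority voting described above can be illustrated with a minimal standard-library sketch. This is not the study's code (which used the Imblearn implementation); the function names, the random seed and the example class counts (50 deaths among 356 patients) are hypothetical and chosen only for illustration. The sampling follows the general balanced random forest idea of drawing equally sized bootstrap samples from each class.

```python
import math
import random
from collections import Counter


def balanced_bootstrap(labels, rng):
    """Return row indices for one tree, with equal numbers from each class.

    Every class is sampled with replacement, and the sample size per class
    is the size of the smallest (minority) class, so the majority class is
    effectively under-sampled.
    """
    rows_by_class = {}
    for index, label in enumerate(labels):
        rows_by_class.setdefault(label, []).append(index)
    n_minority = min(len(rows) for rows in rows_by_class.values())
    sample = []
    for rows in rows_by_class.values():
        sample.extend(rng.choices(rows, k=n_minority))
    return sample


def features_per_split(n_features):
    """Number of candidate variables per node split (square-root rule)."""
    return max(1, int(math.sqrt(n_features)))


def majority_vote(tree_predictions):
    """Combine the class votes of all trees into one prediction."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Hypothetical training fold: 50 deaths (label 1) among 356 patients.
labels = [1] * 50 + [0] * 306
rows = balanced_bootstrap(labels, rng)
class_counts = Counter(labels[i] for i in rows)
n_candidates = features_per_split(91)  # 91 input variables in the study

print(class_counts[0], class_counts[1])  # equal class sizes in the sample
print(n_candidates)                      # candidate variables per split
print(majority_vote([1, 1, 0]))          # two of three trees vote 1
```

With 91 input variables the square-root rule gives nine candidate variables per split, and each tree sees a sample in which deaths and survivors are equally represented regardless of their proportion in the fold.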
An exhaustive search was performed to determine the number of variables to include in the final model, by describing how the mean AUC changed with the number of variables included. In each iteration of the search, the least important variable, as ranked by Gini impurity, was excluded. See Supplementary figures 2 and 3 for details.
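The search described above, in which the least important variable is dropped in each iteration and the mean AUC is recorded, resembles what is often called backward elimination. A minimal sketch of that loop follows. The callback `fit_and_score`, the toy weights and the toy AUC formula are hypothetical stand-ins for training a Balanced Random Forest and reading its Gini-based importances; they are not taken from the study.

```python
def backward_elimination(variables, fit_and_score):
    """Drop the least important variable one at a time.

    `fit_and_score(subset)` is a hypothetical callback that trains a model
    on `subset` and returns (mean_auc, importances), where `importances`
    maps each variable name to its importance.
    Returns a list of (number_of_variables, mean_auc, subset) tuples.
    """
    remaining = list(variables)
    history = []
    while remaining:
        mean_auc, importances = fit_and_score(remaining)
        history.append((len(remaining), mean_auc, tuple(remaining)))
        least_important = min(remaining, key=lambda name: importances[name])
        remaining.remove(least_important)
    return history


# Toy stand-in for model training: invented weights, for illustration only.
TOY_WEIGHTS = {"a": 0.20, "b": 0.10, "c": 0.02, "d": 0.0}


def toy_fit_and_score(subset):
    importances = {name: TOY_WEIGHTS[name] for name in subset}
    mean_auc = 0.5 + sum(importances.values())
    return mean_auc, importances


history = backward_elimination(TOY_WEIGHTS, toy_fit_and_score)
sizes = [n_vars for n_vars, _, _ in history]
last_subset = history[-1][2]

for n_vars, mean_auc, subset in history:
    print(n_vars, round(mean_auc, 2), subset)
```

Plotting the recorded mean AUC against the number of variables gives a curve of the kind shown in the supplementary figures, from which a cut-off can be read where adding variables no longer improves the AUC.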
In accordance with inclusion and exclusion criteria, a total of 445 patients were enrolled in the current study. See Fig. 1.
The median age was 73 years (IQR 59–84) and 52.6% were men. A total of 323 patients arrived to the ED via EMS and the median length of in-hospital stay was 4 days (IQR 2–9). A total of 63 (14.2%) patients died within 7 days and 98 (22.0%) patients died within 30 days. See Table 1 for characteristics of the study population.
Predicting 7-day mortality
The accuracy of the model predicting 7-day mortality did not improve by including more than six variables, see Supplementary figure 2. These six most important variables for predicting 7-day mortality were in descending order: “fever”, “abnormal verbal response”, “low oxygen saturation”, “arrival by EMS”, “abnormal behaviour or level of consciousness” and “chills”. See Table 2 for the prevalence of these variables in the study population.
When only these six variables were included, the Balanced Random Forest achieved a sensitivity of 0.84 (CI 95%, 0.78–0.89), specificity 0.67 (CI 95%, 0.64–0.70), PPV 0.31 (CI 95%, 0.28–0.33), NPV 0.96 (CI 95%, 0.95–0.97), positive LR 2.61 (CI 95%, 2.32–2.90) and negative LR 0.24 (CI 95%, 0.16–0.33). The AUC of the ROC curve was 0.83 (CI 95%, 0.80–0.86), see Supplementary figure 4. Fig. 2 illustrates the relationship between these six most important variables and 7-day mortality.
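For readers less familiar with likelihood ratios, the standard definitions are LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. The short sketch below applies them to the mean sensitivity and specificity reported above; the function name is hypothetical. The reported LRs are means across folds, so the values computed from the mean sensitivity and specificity need not match them exactly.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR) from sensitivity and specificity."""
    positive_lr = sensitivity / (1.0 - specificity)
    negative_lr = (1.0 - sensitivity) / specificity
    return positive_lr, negative_lr


# Mean sensitivity and specificity reported for the 7-day model.
positive_lr, negative_lr = likelihood_ratios(0.84, 0.67)
print(f"LR+ = {positive_lr:.2f}, LR- = {negative_lr:.2f}")
# prints "LR+ = 2.55, LR- = 0.24"
```

The negative LR agrees with the reported 0.24, while the positive LR (2.55) is slightly lower than the reported 2.61, which would be consistent with the ratio having been averaged fold by fold.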
Predicting 30-day mortality
The accuracy of the model predicting 30-day mortality did not improve by including more than six variables, see Supplementary figure 3. These six most important variables for predicting 30-day mortality were in descending order: “abnormal verbal response”, “fever”, “chills”, “arrival by EMS”, “low oxygen saturation” and “breathing difficulties”. See Table 3 for the prevalence of these variables in the study population.
When only these six variables were included, the Balanced Random Forest achieved a sensitivity of 0.87 (CI 95%, 0.81–0.93), specificity 0.64 (CI 95%, 0.61–0.67), PPV 0.41 (CI 95%, 0.39–0.44), NPV 0.95 (CI 95%, 0.92–0.97), positive LR 2.45 (CI 95%, 2.22–2.68) and negative LR 0.20 (CI 95%, 0.11–0.30). The AUC of the ROC curve was 0.80 (CI 95%, 0.78–0.82), see Supplementary figure 5. Fig. 3 illustrates the relationship between the six most important variables and 30-day mortality.
A sub-analysis was performed using the five most important variables, with “breathing difficulties” replaced by tachypnea. The AUC of the ROC curve was 0.77 (CI 95%, 0.75–0.79), see Supplementary figure 6.
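The predictive values depend on how common the outcome is, which helps explain why the PPV is higher for 30-day than for 7-day mortality. As a rough check, the sketch below derives PPV and NPV from the mean sensitivity and specificity of the 30-day model and the 30-day mortality in the study population (98 of 445 patients), using Bayes' rule. The function name is hypothetical, and the calculation assumes the test folds have about the same mortality as the whole population.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given outcome prevalence, via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# 30-day mortality in the study population: 98 of 445 patients.
prevalence_30_day = 98 / 445
ppv, npv = predictive_values(0.87, 0.64, prevalence_30_day)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
# prints "PPV = 0.41, NPV = 0.95"
```

Both values agree with the reported PPV of 0.41 and NPV of 0.95 after rounding.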
The results show that the following six variables: “fever”, “abnormal verbal response”, “low oxygen saturation”, “arrival by EMS”, “abnormal behaviour or level of consciousness” and “chills” were the most important variables for predicting 7-day mortality among septic patients presenting to the ED. The variables “abnormal verbal response”, “fever”, “chills”, “arrival by EMS”, “low oxygen saturation” and “breathing difficulties” were the most important variables for predicting 30-day mortality. We suggest the current study be viewed as a proof-of-principle that machine learning, specifically Random Forest, is a suitable tool with which to include a large number of variables reflecting the presentation of sepsis.
It is interesting that the model, using mode of arrival, abnormal verbal response, abnormal behaviour or level of consciousness, chills and two vital signs (low oxygen saturation and fever), could predict 7-day mortality with a good AUC and sensitivity. Several of these variables are not included in previously suggested predictive tools [8, 9, 12]. Prior studies support that low oxygen saturation, abnormal level of consciousness, the absence of chills, fever and rigors, in addition to hyperglycaemia, are predictive of poor outcome for septic patients [13, 14]. It may be surprising that the results suggest that fever and chills are protective in sepsis, i.e. associated with lower mortality. However, we speculate that fever and chills can represent a well-functioning immune response. Another explanation for the better outcome could be that patients with these symptoms are more easily recognised as septic and hence receive timely treatment. Prior studies included either septic patients arriving with EMS or patients with septic shock [13, 14], while the current study includes all septic patients presenting to the ED. Thus, the current study is likely to include septic patients with a greater variation in disease severity. To our knowledge, no prior study of the association between clinical presentation to the ED and sepsis mortality has included this number of variables reflecting the presentation.
Overall, the accuracy of the model predicting 30-day mortality was similar to that of the model predicting 7-day mortality. It is, however, notable that the model predicting 30-day mortality included “breathing difficulties” and not “abnormal behaviour or level of consciousness”. Furthermore, the model’s performance was similar when breathing difficulties was replaced with tachypnea (AUC 0.77 compared with 0.80). This suggests that symptoms and signs related to breathing are important variables when predicting 30-day mortality for septic patients. Considering both models, we suggest that the model predicting 7-day mortality is more clinically relevant for ED practitioners, as knowledge regarding short-term outcome could be used for bedside clinical decision-making in the ED.
The clinical presentation of sepsis is often non-specific and variable. We suggest that this makes the application of machine learning suitable, owing to its ability to identify previously unconsidered associations among a large number of variables, and that its potential could be increased in larger data sets. The Random Forest Classifier has been applied with promising results in several fields, such as forecasting gross domestic product growth and urban planning, and also in predicting sepsis mortality. The six most important variables identified in the current study could thus be included to support disposition decisions and to predict the likelihood of poor outcome, which may have an impact on resource utilization and allocation. However, more importantly, we believe the current study to be a proof-of-principle of the possibilities of the method.
The current study is not without limitations. First, the data was collected when the SEPSIS-2 criteria were in use. Some of the septic patients included in the current study may not be classified as septic in accordance with SEPSIS-3. However, sepsis according to SEPSIS-3 is more prevalent than severe sepsis according to the SEPSIS-2 criteria, and almost half of the current study population had severe sepsis. Therefore, it is reasonable to assume that a large proportion of the current study population would also be septic in accordance with SEPSIS-3. The use of ICD-10 codes to define sepsis may also be questioned. Although it has previously been described that only one in seven septic patients are identified when using ICD-10 codes, it is likely that the patients receiving ICD-10 codes relating to sepsis are the sickest septic patients, which is also supported by the large proportion of patients with severe sepsis. In addition, identifying septic patients based on ICD-10 codes is a common method in registry studies.
Considering the use of the Random Forest, each decision tree is trained on bagged data using a random selection of features; gaining a full understanding of the decision process is therefore difficult. However, with the implementation of the SHAP interpreter, it is possible to illustrate the association between specific predictive variables and the outcome. Random Forest was chosen as it is a tree-based model, which follows a decision pattern clinically used in emergency medicine and controls for the over-fitting otherwise observed in other tree-based models. The decision to use six variables in the final model is to a certain extent arbitrary. However, the AUC did not increase by including more variables, as presented in Supplementary figures 2 and 3. From a low- or middle-income country’s perspective, where IT systems may not be available to the same extent, one could argue that fewer variables are better as this simplifies the use of a predictive model.
Despite the limitations of the current study, the results indicate that there is an association between specific variables measurable in the ED and sepsis mortality. Further studies are needed to evaluate the use of these six specific variables in models or tools predictive of outcome and, more importantly, to use Random Forest and other machine learning methods to enable the analysis of large amounts of data to build predictive models.
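As a note on the SHAP interpretation mentioned among the limitations above: SHAP attributes a prediction to individual variables using Shapley values, i.e. each variable's marginal contribution averaged over all orders in which the variables can be added. The brute-force sketch below illustrates that definition only. The toy risk function and all of its numbers are invented for illustration and are not results from the study, and SHAP implementations use far more efficient algorithms than enumerating every ordering.

```python
from itertools import permutations


def shapley_values(features, value):
    """Exact Shapley values: average marginal contribution over all orderings.

    `value(subset)` returns the model output when only the variables in
    `subset` (a frozenset) are present.
    """
    totals = {name: 0.0 for name in features}
    orderings = list(permutations(features))
    for ordering in orderings:
        included = set()
        for name in ordering:
            before = value(frozenset(included))
            included.add(name)
            totals[name] += value(frozenset(included)) - before
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_risk(present):
    """Hypothetical risk score; all numbers are invented for illustration."""
    risk = 0.20
    if "fever" in present:
        risk -= 0.10
    if "low_saturation" in present:
        risk += 0.15
    if "fever" in present and "low_saturation" in present:
        risk += 0.04  # toy interaction term
    return risk


phi = shapley_values(["fever", "low_saturation"], toy_risk)
baseline = toy_risk(frozenset())
full = toy_risk(frozenset({"fever", "low_saturation"}))

for name, contribution in phi.items():
    print(name, round(contribution, 3))
```

The contributions sum to the difference between the full prediction and the baseline, which is the property that lets a SHAP plot split a prediction into per-variable effects; a negative value, as for fever in this toy example, indicates a variable that lowers the predicted risk.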
The results indicate that the following variables, measurable in the ED, were predictive of both 7-day and 30-day mortality among septic patients: fever, abnormal verbal response, low oxygen saturation, arrival by EMS and chills. Moreover, abnormal behaviour or level of consciousness was predictive of 7-day mortality, while breathing difficulties was more important for predicting 30-day mortality. These results suggest that symptoms, observations and mode of arrival may be important variables to include in future tools for predicting mortality among patients with suspected sepsis. These results do, however, need to be validated in other cohorts. In addition, Random Forest appears to be a suitable machine learning method on which to build future studies.
Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request. The custom code used for the statistical analyses can be found at the following DOI: https://doi.org/10.5281/zenodo.4642088
ICD: International classification of diseases
AUC: Area under the ROC curve
PPV: Positive predictive value
NPV: Negative predictive value
SHAP: SHapley Additive exPlanations
EMS: Emergency medical services
SOFA: Sequential organ failure assessment
Singer M, Deutschman CS, Seymour CW, Shankar-Hari M, Annane D, Bauer M, et al. The third international consensus definitions for Sepsis and septic shock (Sepsis-3). JAMA. 2016;315(8):801–10. https://doi.org/10.1001/jama.2016.0287.
Ljungström L, Andersson R, Jacobsson G. Incidences of community onset severe sepsis, Sepsis-3 sepsis, and bacteremia in Sweden - a prospective population-based study. PLoS One. 2019;14(12):e0225700. https://doi.org/10.1371/journal.pone.0225700.
Iwashyna TJ, Cooke CR, Wunsch H, Kahn JM. Population burden of long-term survivorship after severe sepsis in older Americans. J Am Geriatr Soc. 2012;60(6):1070–7. https://doi.org/10.1111/j.1532-5415.2012.03989.x.
Rudd KE, Johnson SC, Agesa KM, Shackelford KA, Tsoi D, Kievlan DR, et al. Global, regional, and national sepsis incidence and mortality, 1990-2017: analysis for the global burden of disease study. Lancet. 2020;395(10219):200–11. https://doi.org/10.1016/S0140-6736(19)32989-7.
Edman-Wallér J, Ljungström L, Jacobsson G, Andersson R, Werner M. Systemic symptoms predict presence or development of severe sepsis and septic shock. Infect Dis (Lond). 2016;48(3):209–14. https://doi.org/10.3109/23744235.2015.1104719.
Ferrer R, Martin-Loeches I, Phillips G, Osborn TM, Townsend S, Dellinger RP, et al. Empiric antibiotic treatment reduces mortality in severe sepsis and septic shock from the first hour: results from a guideline-based performance improvement program. Crit Care Med. 2014;42(8):1749–55. https://doi.org/10.1097/CCM.0000000000000330.
Leisman DE, Goldman C, Doerfler ME, Masick KD, Dries S, Hamilton E, et al. Patterns and outcomes associated with timeliness of initial crystalloid resuscitation in a prospective Sepsis and septic shock cohort. Crit Care Med. 2017;45(10):1596–606. https://doi.org/10.1097/CCM.0000000000002574.
Brink A, Alsma J, Verdonschot RJCG, Rood PPM, Zietse R, Lingsma HF, et al. Predicting mortality in patients with suspected sepsis at the emergency department; a retrospective cohort study comparing qSOFA, SIRS and National Early Warning Score. PLoS One. 2019;14(1):e0211133. https://doi.org/10.1371/journal.pone.0211133.
Jones AE, Saak K, Kline JA. Performance of the mortality in emergency department Sepsis score for predicting hospital mortality among patients with severe sepsis and septic shock. Am J Emerg Med. 2008;26(6):689–92. https://doi.org/10.1016/j.ajem.2008.01.009.
Suffoletto B, Frisch A, Prabhu A, Kristan J, Guyette FX, Callaway CW. Prediction of serious infection during prehospital emergency care. Prehosp Emerg Care. 2011;15(3):325–30. https://doi.org/10.3109/10903127.2011.561411.
Ahuja AS. The impact of artificial intelligence in medicine on the future role of the physician. PeerJ. 2019;7:e7702. https://doi.org/10.7717/peerj.7702.
Taylor RA, Pare JR, Venkatesh AK, Mowafi H, Melnick ER, Fleischman W, et al. Prediction of in-hospital mortality in emergency department patients with Sepsis: a local big data–driven, Machine Learning Approach. Acad Emerg Med. 2016;23(3):269–78. https://doi.org/10.1111/acem.12876.
Olander A, Andersson H, Sundler AJ, Bremer A, Ljungström L, Andersson Hagiwara M. Prehospital characteristics among patients with sepsis: a comparison between patients with or without adverse outcome. BMC Emergency Medicine. 2019;19(1):43. https://doi.org/10.1186/s12873-019-0255-0.
Filbin MR, Lynch J, Gillingham TD, Thorsen JE, Pasakarnis CL, Nepal S, et al. Presenting symptoms independently predict mortality in septic shock: importance of a previously unmeasured confounder. Crit Care Med. 2018;46(10):1592–9. https://doi.org/10.1097/CCM.0000000000003260.
Wallgren UM, Bohm KEM, Kurland L. Presentations of adult septic patients in the prehospital setting as recorded by emergency medical services: a mixed methods analysis. Scand J Trauma Resusc Emerg Med. 2017;25(1):23. https://doi.org/10.1186/s13049-017-0367-z.
About Södersjukhuset [internet]. Södersjukhuset AB. 2019. http://www.sodersjukhuset.se/om-sos/fakta-om-sodersjukhuset/. Accessed 23 Mar 2020.
The national board of health and welfare; 2011. Definition of health care associated infection. https://termbank.socialstyrelsen.se/?TermId=436&SrcLang=sv. Accessed 13 Jul 2020.
Levy MM, Fink MP, Marshall JC, et al. 2001 SCCM/ESICM/ACCP/ATS/SIS international Sepsis definitions conference. Intensive Care Med. 2003;29(4):530–8. https://doi.org/10.1007/s00134-003-1662-x.
Wallgren UM, Castrén M, Svensson AEV, Kurland L. Identification of adult septic patients in the prehospital setting: a comparison of two screening tools and clinical judgment. Eur J Emerg Med. 2014;21(4):260–5. https://doi.org/10.1097/MEJ.0000000000000084.
Balanced Random Forest Classifier from the Imblearn collection. https://imbalanced-learn.readthedocs.io/en/stable/generated/imblearn.ensemble.BalancedRandomForestClassifier.html. Accessed 17 May 2020.
Chen C, Liaw A, Breiman L. Using random Forest to learn imbalanced data. Department of statistics 2004. https://statistics.berkeley.edu/sites/default/files/tech-reports/666.pdf. Accessed 18 Sept 2020.
Drummond C, Holte RC. C4.5, Class Imbalance, and Cost Sensitivity: Why Under-Sampling beats Over-Sampling. 2003. https://www.site.uottawa.ca/~nat/Workshop2003/drummondc.pdf. Accessed 2 May 2021.
Biau O, D’elia A. Euro Area GDP Forecast Using Large Survey Dataset - A Random Forest Approach. EcoMod. EcoMod2010;259600029. www.econpapers.repec.org/paper/ekd002596/259600029.htm. Accessed 28 Jul 2020.
Sideris N, Bardis G, Voulodimos A, Miaoulis G, Ghazanfarpour D. Using random forests on real-World City data for urban planning in a visual semantic decision support system. Sensors (Basel). 2019;19(10):2266. https://doi.org/10.3390/s19102266.
Henriksen DP, Laursen CB, Jensen TG, Hallas J, Pedersen C, Lassen AT. Incidence rate of community-acquired sepsis among hospitalized acute medical patients-a population-based survey. Crit Care Med. 2015;43(1):13–21. https://doi.org/10.1097/CCM.0000000000000611.
Martin GS, Mannino DM, Eaton S, Moss M. The epidemiology of Sepsis in the United States from 1979 through 2000. N Engl J Med. 2003;348(16):1546–54. https://doi.org/10.1056/NEJMoa022139.
The authors would like to acknowledge Örebro university for economic support. The authors would also like to give a special thanks to Jennifer Short and Anna Su for compiling the original data file, which the current study data is derived from.
Patient and public involvement
There was no direct involvement of patients in the current study.
This study was supported by the University of Örebro. The funders were not involved in interpreting the data, nor writing the manuscript. Open Access funding provided by Örebro University.
Consent for publication
The authors declare that they have no competing interests.
Table describing all 91 variables included.
Predicting 7-day mortality - Mean AUC compared with number of variables included.
Predicting 30-day mortality - Mean AUC compared with number of variables included.
The receiver operating characteristic curve for predicting 7-day mortality.
The receiver operating characteristic curve for predicting 30-day mortality.
Sub-analysis for predicting 30-day mortality - the receiver operating characteristic curve. The predictive variable breathing difficulties is replaced with tachypnea.
Cite this article
Karlsson, A., Stassen, W., Loutfi, A. et al. Predicting mortality among septic patients presenting to the emergency department–a cross sectional analysis using machine learning. BMC Emerg Med 21, 84 (2021). https://doi.org/10.1186/s12873-021-00475-7
- Clinical assessment
- Emergency care systems
- Emergency department
- Infectious diseases