Abstract
Sepsis is a life-threatening medical condition that, if not treated promptly, can result in tissue damage, organ failure, and death. According to the Centers for Disease Control and Prevention, about 270,000 individuals die of sepsis in the US each year. Further, sepsis expenditures accounted for 13% of total US hospital costs in 2013, totaling more than $24 billion. Our project objective was to determine whether machine learning algorithms could reliably predict the duration of hospital stay for patients with sepsis. The data set we used has been de-identified and is freely available through the BupaR package. It includes 1050 cases, 15214 events, and 16 types of actions related to sepsis patient care. First, we used process mining to determine how long each patient was in the hospital. Using BupaR's functions, we created several process model graphs. These process models depict the movement of patients through a hospital and provide duration data for each patient case. Second, we identified outliers and created two versions of the data set: one with and one without outliers. We then applied the following analysis methods: Linear Regression, Random Forest, K-Nearest Neighbors, Neural Networks, XGBoost, and lightGBM. We compared the validation results of the six machine learning models using the same data-splitting method. We found that the XGBoost model had the best prediction accuracy: 73.9 percent for the data set with outliers and 79 percent for the data set without outliers. We also found that the lightGBM model had the lowest mean absolute error between predicted and actual duration: 3.66 days for the data set with outliers and 2.4 days for the data set without outliers. These two models outperformed the other four models. This work will be enhanced in the future by exploring new prediction algorithms and comparing them with the results of this study.
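The steps named in the abstract (per-case duration from an event log, outlier flagging, and mean absolute error in days) can be illustrated with a minimal Python sketch. It is not the authors' code: the study used BupaR in R, and the abstract does not state which outlier rule was applied. The toy event rows, the function names, and the 1.5 × IQR fence are all assumptions made for illustration.

```python
from datetime import datetime
from statistics import quantiles

# Hypothetical toy event log: (case_id, activity, timestamp).
# These rows are invented; the real data set has 1050 cases and 15214 events.
EVENTS = [
    ("A", "admission", "2014-01-01 08:00"), ("A", "discharge", "2014-01-04 08:00"),
    ("B", "admission", "2014-01-02 09:00"), ("B", "discharge", "2014-01-07 09:00"),
    ("C", "admission", "2014-01-03 10:00"), ("C", "discharge", "2014-01-07 10:00"),
    ("D", "admission", "2014-01-05 11:00"), ("D", "discharge", "2014-01-11 11:00"),
    ("E", "admission", "2014-01-06 12:00"), ("E", "discharge", "2014-03-07 12:00"),
]


def case_durations_days(events):
    """Return {case_id: days between the first and last event of that case}."""
    first, last = {}, {}
    for case_id, _activity, stamp in events:
        t = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        first[case_id] = min(first.get(case_id, t), t)
        last[case_id] = max(last.get(case_id, t), t)
    return {c: (last[c] - first[c]).total_seconds() / 86400 for c in first}


def iqr_outliers(durations, k=1.5):
    """Return case ids outside the fences [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: the IQR rule stands in for the paper's unstated outlier rule.
    """
    q1, _median, q3 = quantiles(durations.values(), n=4, method="inclusive")
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return {c for c, d in durations.items() if d < low or d > high}


def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual durations."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


if __name__ == "__main__":
    durations = case_durations_days(EVENTS)
    outliers = iqr_outliers(durations)
    without_outliers = {c: d for c, d in durations.items() if c not in outliers}
    print("durations (days):", durations)
    print("flagged as outliers:", outliers)
    print("cases kept:", sorted(without_outliers))
```

Building the "with outliers" and "without outliers" versions then amounts to keeping or dropping the flagged cases before any model is trained, and `mean_absolute_error` gives a figure in days that can be compared across models.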
Original language | English |
---|---|
Title of host publication | SoutheastCon 2022 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 302-309 |
Number of pages | 8 |
ISBN (Electronic) | 9781665406529 |
State | Published - 2022 |
Event | SoutheastCon 2022 - Mobile, United States; Duration: Mar 26 2022 → Apr 3 2022 |
Publication series
Name | Conference Proceedings - IEEE SOUTHEASTCON |
---|---|
Volume | 2022-March |
ISSN (Print) | 1091-0050 |
ISSN (Electronic) | 1558-058X |
Conference
Conference | SoutheastCon 2022 |
---|---|
Country/Territory | United States |
City | Mobile |
Period | 03/26/22 → 04/03/22 |
Funding
This work was supported in part by the U.S. Department of Energy, Office of Science, Office of Workforce Development for Teachers and Scientists (WDTS) under the Science Undergraduate Laboratory Internships Program (SULI). This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
Keywords
- Comparative Study
- Healthcare
- K-Nearest Neighbors
- Linear Regression
- Machine Learning
- Neural Networks
- Process Mining
- Random Forest
- XGBoost
- lightGBM