TY - GEN
T1 - Six Machine-Learning Methods for Predicting Hospital-Stay Duration for Patients with Sepsis
T2 - SoutheastCon 2022
AU - Chen, Lingtao
AU - Klasky, Hilda B.
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Sepsis is a life-threatening medical condition that, if not treated promptly, can result in tissue damage, organ failure, and death. According to the Centers for Disease Control and Prevention, about 270,000 individuals die of sepsis in the US each year. Further, sepsis expenditures accounted for 13% of total US hospital costs in 2013, totaling more than $24 billion. Our project objective was to determine whether machine learning algorithms could reliably predict hospital stay duration for patients with sepsis. The dataset we used has been de-identified and is freely available through the BupaR package. The data include 1,050 cases, 15,214 events, and 16 types of actions related to sepsis patient care. First, we used process mining to determine how long each patient was in the hospital. Using BupaR's functions, we created several process model graphs. These process models depict the movement of patients through a hospital and provide duration data for each patient case. Second, we identified outlier data and created two dataset versions: one with and one without outliers. We then applied the following analysis methods: Linear Regression, Random Forest, K-Nearest Neighbors, Neural Networks, XGBoost, and lightGBM. We compared the validation results of the six machine learning models using the same data-splitting method. We found that the XGBoost model had the best prediction accuracy: 73.9 percent for cases with outliers and 79 percent for cases without outliers. We also found that the lightGBM model had the lowest mean absolute error between predicted and actual duration: 3.66 days for cases with outliers and 2.4 days for cases without outliers. These two models outperformed the other four models. Future work will explore additional prediction algorithms and compare them with the results of this study.
AB - Sepsis is a life-threatening medical condition that, if not treated promptly, can result in tissue damage, organ failure, and death. According to the Centers for Disease Control and Prevention, about 270,000 individuals die of sepsis in the US each year. Further, sepsis expenditures accounted for 13% of total US hospital costs in 2013, totaling more than $24 billion. Our project objective was to determine whether machine learning algorithms could reliably predict hospital stay duration for patients with sepsis. The dataset we used has been de-identified and is freely available through the BupaR package. The data include 1,050 cases, 15,214 events, and 16 types of actions related to sepsis patient care. First, we used process mining to determine how long each patient was in the hospital. Using BupaR's functions, we created several process model graphs. These process models depict the movement of patients through a hospital and provide duration data for each patient case. Second, we identified outlier data and created two dataset versions: one with and one without outliers. We then applied the following analysis methods: Linear Regression, Random Forest, K-Nearest Neighbors, Neural Networks, XGBoost, and lightGBM. We compared the validation results of the six machine learning models using the same data-splitting method. We found that the XGBoost model had the best prediction accuracy: 73.9 percent for cases with outliers and 79 percent for cases without outliers. We also found that the lightGBM model had the lowest mean absolute error between predicted and actual duration: 3.66 days for cases with outliers and 2.4 days for cases without outliers. These two models outperformed the other four models. Future work will explore additional prediction algorithms and compare them with the results of this study.
KW - Comparative Study
KW - Healthcare
KW - K-Nearest Neighbors
KW - Linear Regression
KW - Machine Learning
KW - Neural Networks
KW - Process Mining
KW - Random Forest
KW - XGBoost
KW - lightGBM
UR - http://www.scopus.com/inward/record.url?scp=85129887897&partnerID=8YFLogxK
U2 - 10.1109/SoutheastCon48659.2022.9764052
DO - 10.1109/SoutheastCon48659.2022.9764052
M3 - Conference contribution
AN - SCOPUS:85129887897
T3 - Conference Proceedings - IEEE SOUTHEASTCON
SP - 302
EP - 309
BT - SoutheastCon 2022
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 26 March 2022 through 3 April 2022
ER -