Six Machine-Learning Methods for Predicting Hospital-Stay Duration for Patients with Sepsis: A Comparative Study

Lingtao Chen, Hilda B. Klasky

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Sepsis is a life-threatening medical condition that, if not treated promptly, can result in tissue damage, organ failure, and death. According to the Centers for Disease Control, about 270,000 individuals die of sepsis in the US each year. Further, sepsis expenditures accounted for 13% of total US hospital costs in 2013, totaling more than $24 billion. Our objective was to determine whether machine learning algorithms could reliably predict hospital-stay duration for patients with sepsis. The data set we used has been de-identified and is freely available through the BupaR package. The data include 1050 cases, 15214 events, and 16 types of actions related to sepsis patient care. First, we used process mining to determine how long each patient was in the hospital. Using BupaR's functions, we created several process model graphs. These process models depict the movement of patients through the hospital and provide duration data for each patient case. Second, we identified outlier data and created two versions of the data set: one with and one without outliers. We then applied the following analysis methods: Linear Regression, Random Forest, K-Nearest Neighbors, Neural Networks, XGBoost, and lightGBM. We compared the validation results for the six machine learning models using the same data-splitting method. We found that the XGBoost model had the best prediction accuracy: 73.9 percent for cases with outliers and 79 percent for cases without outliers. We also found that the lightGBM model had the lowest mean absolute error between predicted and actual duration: 3.66 days for cases with outliers and 2.4 days for cases without outliers. These two models outperformed the other four models. This work will be enhanced in the future by exploring new prediction algorithms and comparing them with the results of this study.
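
The pipeline described in the abstract (per-case duration from an event log, an outlier-free copy of the data, and mean absolute error in days) can be illustrated with a minimal Python sketch. This is not the authors' code: the study used the BupaR package in R, and the abstract does not say how outliers were identified, so the interquartile-range rule, the function names, and the toy event log below are all assumptions made for illustration.

```python
from datetime import datetime
from statistics import quantiles

def case_durations_days(events):
    """Return {case_id: days between the first and last event of that case}."""
    first, last = {}, {}
    for case_id, timestamp in events:
        t = datetime.fromisoformat(timestamp)
        if case_id not in first or t < first[case_id]:
            first[case_id] = t
        if case_id not in last or t > last[case_id]:
            last[case_id] = t
    return {c: (last[c] - first[c]).total_seconds() / 86400 for c in first}

def drop_iqr_outliers(durations, k=1.5):
    """Keep cases inside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: the IQR rule stands in for the paper's unspecified
    outlier criterion.
    """
    q1, _, q3 = quantiles(durations.values(), n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return {c: d for c, d in durations.items() if low <= d <= high}

def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted durations (days)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical toy event log: (case id, ISO timestamp) pairs.
events = [
    ("A", "2014-01-01T08:00:00"), ("A", "2014-01-03T08:00:00"),
    ("B", "2014-01-01T00:00:00"), ("B", "2014-01-04T00:00:00"),
    ("C", "2014-01-02T00:00:00"), ("C", "2014-01-06T00:00:00"),
    ("D", "2014-01-05T00:00:00"), ("D", "2014-01-10T00:00:00"),
    ("E", "2014-01-01T00:00:00"), ("E", "2014-03-02T00:00:00"),
]

with_outliers = case_durations_days(events)
without_outliers = drop_iqr_outliers(with_outliers)
```

Any of the six regressors could then be fitted on each version of the data with the same train/test split, and `mean_absolute_error` applied to the held-out predictions, so that the models are compared on equal terms.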

Original language: English
Title of host publication: SoutheastCon 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 302-309
Number of pages: 8
ISBN (Electronic): 9781665406529
State: Published - 2022
Event: SoutheastCon 2022 - Mobile, United States
Duration: Mar 26 2022 to Apr 3 2022

Publication series

Name: Conference Proceedings - IEEE SOUTHEASTCON
Volume: 2022-March
ISSN (Print): 1091-0050
ISSN (Electronic): 1558-058X

Conference

Conference: SoutheastCon 2022
Country/Territory: United States
City: Mobile
Period: 03/26/22 to 04/3/22

Funding

This work was supported in part by the U.S. Department of Energy, Office of Science, Office of Workforce Development for Teachers and Scientists (WDTS) under the Science Undergraduate Laboratory Internships Program (SULI). This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Funders    Funder number
Office of Workforce Development for Teachers
U.S. Department of Energy
Office of Science

    Keywords

    • Comparative Study
    • Healthcare
    • K-Nearest Neighbors
    • Linear Regression
    • Machine Learning
    • Neural Networks
    • Process Mining
    • Random Forest
    • XGBoost
    • lightGBM
