Abstract
Individuals with coronavirus disease 2019 (COVID-19) present in a variety of ways, ranging from asymptomatic infection or mild cough to organ failure or death. One of the major challenges for the medical community is determining quickly and accurately how COVID-19 will progress in an individual. Herein, we introduce a new Cut-and-Solve based feature selection program for identifying predictive feature sets in heterogeneous data. We analyze proteomics data from Washington University to identify models ranging in size from one to five features. Logistic regression models were validated using area under the curve (AUC) on both a holdout data set and an independent data set from Massachusetts General Hospital. A variety of known and novel biomarkers for COVID-19 severity were identified. The best model for predicting severe (ventilation or death) vs. non-severe infection uses CALCOCO2 and STC1, with an average AUC = 0.81. Based on the known severity markers, several different proteomic pathways are identified. Enrichment analysis indicates activity associated with inflammatory response, as well as myelination and cardiac function.
Original language | English |
---|---|
Title of host publication | Proceedings - 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023 |
Editors | Xingpeng Jiang, Haiying Wang, Reda Alhajj, Xiaohua Hu, Felix Engel, Mufti Mahmud, Nadia Pisanti, Xuefeng Cui, Hong Song |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 3370-3375 |
Number of pages | 6 |
ISBN (Electronic) | 9798350337488 |
State | Published - 2023 |
Event | 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023 - Istanbul, Turkey Duration: Dec 5 2023 → Dec 8 2023 |
Publication series
Name | Proceedings - 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023 |
---|
Conference
Conference | 2023 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2023 |
---|---|
Country/Territory | Turkey |
City | Istanbul |
Period | 12/5/23 → 12/8/23 |
Funding
The computation for this work was performed on the high-performance computing infrastructure provided by Research Computing Support Services and in part by the National Science Foundation under grant number CNS-1429294 at the University of Missouri, Columbia, MO. DOI: https://doi.org/10.32469/10355/69802
Keywords
- COVID-19
- Feature Selection
- Mixed Integer Programming