High dimensional predictions of suicide risk in 4.2 million US Veterans using ensemble transfer learning

Sayera Dhaubhadel, Kumkum Ganguly, Ruy M. Ribeiro, Judith D. Cohn, James M. Hyman, Nicolas W. Hengartner, Beauty Kolade, Anna Singley, Tanmoy Bhattacharya, Patrick Finley, Drew Levin, Haedi Thelen, Kelly Cho, Lauren Costa, Yuk Lam Ho, Amy C. Justice, John Pestian, Daniel Santel, Rafael Zamora-Resendiz, Silvia Crivelli, Suzanne Tamang, Susana Martins, Jodie Trafton, David W. Oslin, Jean C. Beckham, Nathan A. Kimbrel, Khushbu Agarwal, Allison E. Ashley-Koch, Mihaela Aslan, Edmond Begoli, Ben Brown, Patrick S. Calhoun, Kei Hoi Cheung, Sutanay Choudhury, Ashley M. Cliff, Leticia Cuellar-Hengartner, Haedi E. Deangelis, Michelle F. Dennis, Patrick D. Finley, Michael R. Garvin, Joel E. Gelernter, Lauren P. Hair, Colby Ham, Phillip D. Harvey, Elizabeth R. Hauser, Michael A. Hauser, Nick W. Hengartner, Daniel A. Jacobson, Jessica Jones, Piet C. Jones, David Kainer, Alan D. Kaplan, Ira R. Katz, Rachel L. Kember, Angela C. Kirby, John C. Ko, John Lagergren, Matthew Lane, Daniel F. Levey, Jennifer H. Lindquist, Xianlian Liu, Ravi K. Madduri, Carrie Manore, Carianne Martinez, John F. McCarthy, Mikaela Mc Devitt Cashman, J. Izaak Miller, Destinee Morrow, Mirko Pavicic-Venegas, Saiju Pyarajan, Xue J. Qin, Nallakkandi Rajeevan, Christine M. Ramsey, Ruy Ribeiro, Alex Rodriguez, Jonathon Romero, Yunling Shi, Murray B. Stein, Kyle A. Sullivan, Ning Sun, Suzanne R. Tamang, Alice Townsend, Jodie A. Trafton, Angelica Walker, Xiange Wang, Victoria Wangia-Anderson, Renji Yang, Shinjae Yoo, Hongyu Zhao, Benjamin H. McMahon

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

We present an ensemble transfer learning method to predict suicide from Veterans Affairs (VA) electronic medical records (EMR). A diverse set of base models was trained to predict a binary outcome constructed from reported suicide, suicide attempt, and overdose diagnoses with varying choices of study design and prediction methodology. Each model used twenty cross-sectional and 190 longitudinal variables observed in eight time intervals covering 7.5 years prior to the time of prediction. Ensembles of seven base models were created and fine-tuned with ten variables expected to change with study design and outcome definition in order to predict suicide and the combined outcome in a prospective cohort. The ensemble models achieved c-statistics of 0.73 on 2-year suicide risk and 0.83 on the combined outcome when predicting on a prospective cohort of approximately 4.2 million veterans. The ensembles rely on nonlinear base models trained using a matched retrospective nested case-control (Rcc) study cohort and show good calibration across a diversity of subgroups, including risk strata, age, sex, race, and level of healthcare utilization. In addition, a linear Rcc base model provided a rich set of biological predictors, including indicators of suicide, substance use disorder, mental health diagnoses and treatments, hypoxia and vascular damage, and demographics.
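
For readers unfamiliar with the c-statistic reported in the abstract, the following is a minimal sketch of how it can be computed for a binary outcome: it is the fraction of (case, non-case) pairs in which the case received the higher risk score, with ties counted as one half. The function names, the toy scores, and the simple averaging used to combine two hypothetical base models are all assumptions made for illustration; the averaging is a stand-in and is not the fine-tuned ensemble procedure described in the article.

```python
from typing import Sequence


def c_statistic(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """Concordance for a binary outcome (1 = case, 0 = non-case).

    Returns the fraction of (case, non-case) pairs in which the case
    has the higher score; tied pairs contribute one half.
    """
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    non_cases = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_score in cases:
        for other_score in non_cases:
            if case_score > other_score:
                concordant += 1.0
            elif case_score == other_score:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


def average_scores(*models: Sequence[float]) -> list:
    """Illustrative stand-in for an ensemble: per-person mean of base-model scores."""
    return [sum(person) / len(person) for person in zip(*models)]


# Toy data, invented for illustration only: six people, two of them cases.
outcomes = [1, 1, 0, 0, 0, 0]
model_a = [0.9, 0.4, 0.5, 0.2, 0.1, 0.3]  # hypothetical base model A
model_b = [0.6, 0.7, 0.2, 0.8, 0.1, 0.3]  # hypothetical base model B
combined = average_scores(model_a, model_b)

print(c_statistic(model_a, outcomes))   # 0.875
print(c_statistic(model_b, outcomes))   # 0.75
print(c_statistic(combined, outcomes))  # 1.0
```

The pairwise loop is quadratic in cohort size, so it is only suitable for small examples; on a cohort of millions a rank-based computation would be needed. A c-statistic of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from non-cases.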

Original language: English
Article number: 1793
Journal: Scientific Reports
Volume: 14
Issue number: 1
State: Published - Dec 2024

Funding

This research is based on data from the Million Veteran Program, Office of Research and Development, Veterans Health Administration, supported by award #MVP011. This publication does not represent the views of the Department of Veterans Affairs or the United States Government. J.C. Beckham was also supported by a Senior Research Career Scientist Award (#lK6BX003777) from CSR&D. We would like to thank Eric Caine for helpful discussions during problem formulation and Ethan Romero-Severson for helpful critiques on multiple drafts of this manuscript. This manuscript has been authored by Triad National Security, LLC under Contract No. 89233218CNA000001 with the U.S. Department of Energy/National Nuclear Security Administration. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript or allow others to do so, for United States Government purposes. The Government has also granted for itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide license in the code and data, within this manuscript, to reproduce, prepare derivative works, and perform publicly and display publicly, by or on behalf of the Government. NEITHER THE GOVERNMENT NOR THE CONTRACTOR MAKES ANY WARRANTY, EXPRESS OR IMPLIED, OR ASSUMES ANY LIABILITY FOR THE USE OF CODE WITHIN THIS MANUSCRIPT. This notice including this sentence must appear on any copies of the code.

Funders (with funder number where listed)
Million Veteran Program
United States Government
Center for Scientific Review: 89233218CNA000001
National Nuclear Security Administration
Office of Research and Development
Health Services Research and Development: 6BX003777
