Improving Deep Reinforcement Learning with Transitional Variational Autoencoders: A Healthcare Application

Matthew Baucum, Anahita Khojandi, Rama Vasudevan

Research output: Contribution to journal › Article › peer-review

22 Scopus citations

Abstract

Reinforcement learning is a powerful tool for developing personalized treatment regimens from healthcare data. Yet training reinforcement learning agents through direct interactions with patients is often impractical for ethical reasons. One solution is to train reinforcement learning agents using an 'environment model,' which is learned from retrospective patient data and can simulate realistic patient trajectories. In this study, we propose transitional variational autoencoders (tVAE), a generative neural network architecture that learns a direct mapping between distributions over clinical measurements at adjacent time points. Unlike other models, the tVAE requires few distributional assumptions and benefits from identical training and testing architectures. This model produces more realistic patient trajectories than state-of-the-art sequential decision-making models and generative neural networks, and can be used to learn effective treatment policies.
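
As a rough illustration of the idea in the abstract, the sketch below shows one way a VAE-style model could map measurements at one time point to a distribution over measurements at the next. It is not the authors' architecture: the class name `TransitionVAESketch`, the single linear layers, the squared-error reconstruction term, and the omission of treatment actions are all assumptions made for brevity, and the weights are random rather than trained. The point it illustrates is that the same encode, sample, decode path is used both when computing the loss and when simulating a trajectory.

```python
import math
import random


def linear(weights, bias, x):
    """Affine map; weights is a list of rows, x is a list of floats."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(rng, n_out, n_in, scale=0.1):
    """Small random weights and zero biases (hypothetical initialization)."""
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class TransitionVAESketch:
    """Toy, untrained model; an assumption-laden sketch, not the paper's tVAE."""

    def __init__(self, obs_dim, latent_dim, seed=0):
        self.rng = random.Random(seed)
        self.enc_mu = init_layer(self.rng, latent_dim, obs_dim)
        self.enc_logvar = init_layer(self.rng, latent_dim, obs_dim)
        self.dec = init_layer(self.rng, obs_dim, latent_dim)

    def encode(self, x_t):
        # Map current measurements to a Gaussian over the latent code.
        return linear(*self.enc_mu, x_t), linear(*self.enc_logvar, x_t)

    def sample_latent(self, mu, logvar):
        # Reparameterization: z = mu + sigma * eps, with eps ~ N(0, 1).
        return [m + math.exp(0.5 * lv) * self.rng.gauss(0.0, 1.0)
                for m, lv in zip(mu, logvar)]

    def decode(self, z):
        # Map the latent code to predicted next-step measurements.
        return linear(*self.dec, z)

    def loss(self, x_t, x_next):
        # Squared error against the NEXT time point, plus KL to N(0, I).
        mu, logvar = self.encode(x_t)
        pred = self.decode(self.sample_latent(mu, logvar))
        recon = sum((p - t) ** 2 for p, t in zip(pred, x_next))
        kl = -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                        for m, lv in zip(mu, logvar))
        return recon + kl, recon, kl

    def rollout(self, x0, steps):
        # Simulate a trajectory by feeding each prediction back in.
        traj = [list(x0)]
        for _ in range(steps):
            mu, logvar = self.encode(traj[-1])
            traj.append(self.decode(self.sample_latent(mu, logvar)))
        return traj


model = TransitionVAESketch(obs_dim=3, latent_dim=2, seed=0)
total, recon, kl = model.loss([0.1, 0.2, 0.3], [0.2, 0.1, 0.4])
trajectory = model.rollout([0.1, 0.2, 0.3], steps=4)
```

In a real setting the layers would be deeper networks trained by gradient descent on retrospective patient data, and the simulated trajectories would serve as the environment for a reinforcement learning agent.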

Original language: English
Article number: 9209034
Pages (from-to): 2273-2280
Number of pages: 8
Journal: IEEE Journal of Biomedical and Health Informatics
Volume: 25
Issue number: 6
State: Published - Jun 2021

Funding

Manuscript received April 21, 2020; revised August 4, 2020 and September 21, 2020; accepted September 22, 2020. Date of publication September 29, 2020; date of current version June 4, 2021. This work was supported in part by Science Alliance, The University of Tennessee, and in part by the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the U.S. Department of Energy. (Corresponding author: Anahita Khojandi.) Matthew Baucum and Anahita Khojandi are with the Department of Industrial and Systems Engineering, University of Tennessee, Knoxville, TN 37996 USA (e-mail: [email protected]; [email protected]).

Funders | Funder number
Laboratory Directed Research
Science Alliance
U.S. Department of Energy
Oak Ridge National Laboratory
University of Tennessee

    Keywords

    • Reinforcement learning
    • generative adversarial networks
    • hidden Markov models
    • long short-term memory networks
    • variational autoencoders
