Abstract
Reinforcement learning is a powerful tool for developing personalized treatment regimens from healthcare data. Yet training reinforcement learning agents through direct interactions with patients is often impractical for ethical reasons. One solution is to train reinforcement learning agents using an 'environment model,' which is learned from retrospective patient data and can simulate realistic patient trajectories. In this study, we propose transitional variational autoencoders (tVAE), a generative neural network architecture that learns a direct mapping between distributions over clinical measurements at adjacent time points. Unlike other models, the tVAE requires few distributional assumptions and benefits from identical training and testing architectures. This model produces more realistic patient trajectories than state-of-the-art sequential decision-making models and generative neural networks, and can be used to learn effective treatment policies.
Original language | English |
---|---|
Article number | 9209034 |
Pages (from-to) | 2273-2280 |
Number of pages | 8 |
Journal | IEEE Journal of Biomedical and Health Informatics |
Volume | 25 |
Issue number | 6 |
State | Published - Jun 2021 |
Funding
Manuscript received April 21, 2020; revised August 4, 2020 and September 21, 2020; accepted September 22, 2020. Date of publication September 29, 2020; date of current version June 4, 2021. This work was supported in part by Science Alliance, The University of Tennessee, and in part by the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the U.S. Department of Energy. (Corresponding author: Anahita Khojandi.) Matthew Baucum and Anahita Khojandi are with the Department of Industrial and Systems Engineering, University of Tennessee, Knoxville, TN 37996 USA (e-mail: [email protected]; [email protected]).
Funders | Funder number |
---|---|
Laboratory Directed Research | |
Science Alliance | |
U.S. Department of Energy | |
Oak Ridge National Laboratory | |
University of Tennessee | |
Keywords
- Reinforcement learning
- generative adversarial networks
- hidden Markov models
- long short-term memory networks
- variational autoencoders