Abstract
This study introduces a data-driven framework for analyzing Electronic Health Records (EHRs) that uses transformer-based models to identify and disentangle overlapping treatment contexts. A preprocessing pipeline transforms structured procedural codes into semantically enriched descriptive text, so that attention mechanisms can cluster medical events into treatment milestones: cohesive, distinct components of care processes. The methodology is validated on synthetic datasets derived from the MIMIC-III database and designed to simulate the heterogeneity and overlapping procedural contexts characteristic of real-world EHR scenarios. In quantitative evaluation, attention metrics and unsupervised clustering preserve intra-context relationships while distinguishing inter-context dependencies, indicating that the framework is robust in disentangling concurrent care pathways. By addressing challenges inherent in data heterogeneity, this approach provides a foundation for uncovering complex treatment patterns, advancing clinical decision-making, and optimizing resource allocation in diverse healthcare environments.
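The two steps named in the abstract, turning procedural codes into descriptive text and grouping events by attention, can be illustrated with a minimal sketch. This is not the authors' implementation. The lookup table `CODE_DESCRIPTIONS`, the helper names, the hand-written score matrix, and the threshold are all hypothetical, and thresholded connected components stand in for whatever clustering method the paper actually uses. In practice the scores would come from a transformer's attention weights over the descriptive text.

```python
from collections import defaultdict

# Hypothetical lookup table: a few ICD-9 procedure codes with short
# descriptions, for illustration only. A real pipeline would load the
# full code dictionary rather than hard-code entries.
CODE_DESCRIPTIONS = {
    "9604": "insertion of endotracheal tube",
    "9671": "continuous invasive mechanical ventilation for less than 96 consecutive hours",
    "3615": "single internal mammary-coronary artery bypass",
    "3961": "extracorporeal circulation auxiliary to open heart surgery",
}


def codes_to_text(codes):
    """Map each procedure code to descriptive text (assumed preprocessing step)."""
    return [CODE_DESCRIPTIONS.get(code, f"unknown procedure {code}") for code in codes]


def group_by_attention(scores, threshold):
    """Group event indices into connected components.

    Two events are linked when their symmetrized pairwise score is at
    least `threshold`. This is a simple stand-in for attention-based
    clustering, not the method evaluated in the paper.
    """
    n = len(scores)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if (scores[i][j] + scores[j][i]) / 2 >= threshold:
                parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(n):
        groups[find(i)].append(i)
    return sorted(groups.values())


# Toy example: two interleaved care contexts (airway management and
# cardiac surgery) with a made-up 4x4 attention-like score matrix.
events = ["9604", "3615", "9671", "3961"]
toy_scores = [
    [1.00, 0.10, 0.80, 0.05],
    [0.10, 1.00, 0.10, 0.70],
    [0.60, 0.10, 1.00, 0.10],
    [0.05, 0.90, 0.10, 1.00],
]
texts = codes_to_text(events)
milestones = group_by_attention(toy_scores, threshold=0.5)
```

With these toy scores, events 0 and 2 (the airway procedures) fall into one group and events 1 and 3 (the cardiac procedures) into another, which is the kind of separation of interleaved contexts that the abstract describes.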
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024 |
| Editors | Wei Ding, Chang-Tien Lu, Fusheng Wang, Liping Di, Kesheng Wu, Jun Huan, Raghu Nambiar, Jundong Li, Filip Ilievski, Ricardo Baeza-Yates, Xiaohua Hu |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 8707-8709 |
| Number of pages | 3 |
| ISBN (Electronic) | 9798350362480 |
| State | Published - 2024 |
| Event | 2024 IEEE International Conference on Big Data, BigData 2024 - Washington, United States; Duration: Dec 15 2024 → Dec 18 2024 |
Publication series
| Name | Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024 |
|---|---|
Conference
| Conference | 2024 IEEE International Conference on Big Data, BigData 2024 |
|---|---|
| Country/Territory | United States |
| City | Washington |
| Period | 12/15/24 → 12/18/24 |
Funding
This work is sponsored by the U.S. Department of Veterans Affairs using resources from the Knowledge Discovery Infrastructure, located at Oak Ridge National Laboratory and supported by the Office of Science of the U.S. Department of Energy (DOE). This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the U.S. Department of Energy. The U.S. Government retains and the publisher, by accepting this article for publication, acknowledges that the U.S. Government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for U.S. Government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
Keywords
- Attention Mechanisms
- BERT
- Clinical Pathways
- Data-Driven Healthcare
- Electronic Health Records
- Machine Learning
- Transformer Models
- Treatment Milestones