Abstract
Health Information Technology (HIT) aims to improve healthcare outcomes by organizing and analyzing various health-related data. With data accumulating at a staggering rate, the importance of real-time analytics has been increasing dramatically, shifting the focus of informatics from batch processing to streaming analytics. HIT is also facing unprecedented challenges in adapting to this new requirement and leveraging advanced information technologies. This paper introduces a HIT data and compute platform that supports multi-granularity real-time analytics from heterogeneous data sources. The paper first identifies functional requirements and then proposes a framework that satisfies them using state-of-the-art big data technologies, including Apache Kafka, the Spark Structured Streaming engine, and Delta Lake. To demonstrate the framework's capability to support analytics at multiple time granularities, a statistical process control-based hazard detection algorithm has been implemented on top of it to detect unexpected hazards in order cancellation data from the US Department of Veterans Affairs (VA) in near real-time.
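
The abstract does not describe the hazard detection algorithm itself, so the following is only a minimal sketch of the general idea of statistical process control applied to event counts at a chosen time granularity. It assumes a generic Shewhart-style rule (mean plus or minus k standard deviations), which may differ from the authors' method. The function names, the baseline numbers, and the sample events are all hypothetical, and the sketch uses plain Python rather than Kafka, Spark, or Delta Lake.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, pstdev


def bucket_counts(timestamps, fmt):
    """Count events per time bucket; fmt is a strftime pattern that sets the granularity."""
    return Counter(ts.strftime(fmt) for ts in timestamps)


def control_limits(baseline_counts, k=3.0):
    """Return (lower, upper) limits as mean +/- k population standard deviations.

    Assumption: a generic Shewhart-style rule, not the paper's algorithm.
    """
    center = mean(baseline_counts)
    spread = pstdev(baseline_counts)
    return max(0.0, center - k * spread), center + k * spread


def out_of_control(counts_by_bucket, limits):
    """Return the sorted bucket labels whose count falls outside the limits."""
    lower, upper = limits
    return sorted(b for b, c in counts_by_bucket.items() if c < lower or c > upper)


# Hypothetical baseline of daily cancellation counts (illustrative numbers only).
baseline = [10, 12, 11, 9, 10, 11, 12, 10]
limits = control_limits(baseline)

# Hypothetical new events: 11 cancellations on one day, 30 on the next.
events = [datetime(2020, 1, 1, h % 24) for h in range(11)]
events += [datetime(2020, 1, 2, h % 24) for h in range(30)]

# The same events can be bucketed at different granularities.
daily = bucket_counts(events, "%Y-%m-%d")
hourly = bucket_counts(events, "%Y-%m-%d %H")

# Only the daily buckets are checked here, against the daily baseline.
flagged = out_of_control(daily, limits)
```

In this sketch, the same `bucket_counts` helper produces both daily and hourly counts from one event list, which is one simple way to think about multi-granularity analysis. Each granularity would need its own baseline and limits.
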
Original language | English |
---|---|
Title of host publication | Proceedings - 2020 IEEE International Conference on Big Data, Big Data 2020 |
Editors | Xintao Wu, Chris Jermaine, Li Xiong, Xiaohua Tony Hu, Olivera Kotevska, Siyuan Lu, Weijia Xu, Srinivas Aluru, Chengxiang Zhai, Eyhab Al-Masri, Zhiyuan Chen, Jeff Saltz |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
ISBN (Electronic) | 9781728162515 |
State | Published - Dec 10 2020 |
Event | 8th IEEE International Conference on Big Data, Big Data 2020 - Virtual, Atlanta, United States Duration: Dec 10 2020 → Dec 13 2020 |
Publication series
Name | Proceedings - 2020 IEEE International Conference on Big Data, Big Data 2020 |
---|---|
Volume | 2020-January |
Conference
Conference | 8th IEEE International Conference on Big Data, Big Data 2020 |
---|---|
Country/Territory | United States |
City | Virtual, Atlanta |
Period | 12/10/20 → 12/13/20 |
Funding
This work is sponsored by the US Department of Veterans Affairs under Inter-Agency Agreement number VA118-17-M-2015. Notice: This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
Keywords
- Big Data
- HIT
- Streaming Architecture