Abstract
We describe a data-driven, unsupervised machine learning approach to extract geo-temporal co-occurrence patterns of asthma and the flu from large-scale electronic healthcare reimbursement claims (eHRC) datasets. Specifically, we examine eHRC data from the 2009-2010 pandemic H1N1 influenza season and analyze whether different geographic regions within the United States (US) showed an increase in co-occurrence patterns of the flu and asthma. Our analyses reveal that the temporal patterns extracted from the eHRC data show a distinct lag time between the peak incidence of asthma and that of the flu. While the increased occurrence of asthma contributed to increased flu incidence during the pandemic, this co-occurrence is predominant among female patients. The geo-temporal patterns reveal that the co-occurrence of the flu and asthma is typically concentrated within the south-east US. Further, in agreement with previous studies, large urban areas (such as New York, Miami, and Los Angeles) exhibit co-occurrence patterns that suggest a peak incidence of asthma and the flu significantly early in the spring and winter seasons. Together, our data-analytic approach, integrated within the Oak Ridge Bio-surveillance Toolkit platform, demonstrates how eHRC data can provide novel insights into co-occurring disease patterns.
Original language | English |
---|---|
Article number | 182 |
Journal | Frontiers in Public Health |
Volume | 3 |
DOIs | |
State | Published - Aug 3 2015 |
Funding
Preparation of this paper was funded by ORNL internal SEED project number 7280, “Demonstrating a Novel Bio-Defense Capability using Public Health Data Informatics.” This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
Keywords
- asthma
- disease co-occurrence
- electronic healthcare reimbursement claims
- flu
- non-negative matrix factorization
- public health surveillance