Privacy-preserving Sequential Pattern Mining in distributed EHRs for Predicting Cardiovascular Disease

Eric W. Lee, Li Xiong, Vicki Stover Hertzberg, Roy L. Simpson, Joyce C. Ho

Research output: Contribution to journal › Article › peer-review

6 Scopus citations

Abstract

Electronic health records (EHRs) reveal relationships among patients' conditions, treatments, and outcomes that can be used in healthcare research tasks such as risk prediction. In practice, EHRs may be stored across one or more data warehouses, which makes mining from distributed data sources challenging. Privacy laws pose a further challenge, because patient data cannot be used without some patient privacy guarantees. In this paper, we therefore propose a privacy-preserving framework that uses sequential pattern mining in distributed data sources. Our framework extracts patterns from each source and shares them with the other sources to discover discriminative and representative patterns that can be used for risk prediction while preserving privacy. We demonstrate the framework in a case study of predicting cardiovascular disease in patients with type 2 diabetes, and show its effectiveness with several sources and when differential privacy mechanisms are applied.
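The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it describes, not the authors' method. It assumes each source counts how many of its patient sequences contain a candidate pattern, perturbs that count with Laplace noise before sharing it, and a coordinator sums the noisy counts. The function names, the event codes, and the choice of the Laplace mechanism with sensitivity 1 are all illustrative assumptions.

```python
import random


def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `event in it` advances the iterator, so order is enforced.
    return all(event in it for event in pattern)


def local_support(pattern, sequences):
    """Number of patient sequences at one source that contain `pattern`."""
    return sum(is_subsequence(pattern, s) for s in sequences)


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_support(pattern, sequences, epsilon, rng):
    """Local support plus Laplace noise (assumed mechanism).

    Assumes one sequence per patient, so adding or removing a patient
    changes the count by at most 1 (sensitivity 1).
    """
    return local_support(pattern, sequences) + laplace_noise(1.0 / epsilon, rng)


def aggregate_support(pattern, sites, epsilon, rng):
    """Sum the noisy counts shared by every source (hypothetical coordinator)."""
    return sum(noisy_support(pattern, seqs, epsilon, rng) for seqs in sites)


# Hypothetical toy data: two sources, each holding a few event sequences.
site_a = [["metformin", "insulin", "statin"], ["statin", "metformin"]]
site_b = [["metformin", "statin"], ["insulin"]]
pattern = ("metformin", "statin")

rng = random.Random(0)
estimate = aggregate_support(pattern, [site_a, site_b], epsilon=1.0, rng=rng)
```

In this sketch only the noisy counts leave a source, never the raw sequences. Sharing counts for many patterns would require splitting the privacy budget across them, which the sketch does not attempt.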

Original language: English
Pages (from-to): 384-393
Number of pages: 10
Journal: AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium
Volume: 2021
State: Published - 2021
Externally published: Yes

Funding

Funder: National Institute of General Medical Sciences
Funder number: R01GM118609
