GASP: Graph-Based Approximate Sequential Pattern Mining for Electronic Health Records

Wenqin Dong, Eric W. Lee, Vicki Stover Hertzberg, Roy L. Simpson, Joyce C. Ho

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Sequential pattern mining can be used to extract meaningful sequences from electronic health records. However, conventional sequential pattern mining algorithms that discover all frequent sequential patterns can incur a high computational cost and be susceptible to noise in the observations. Approximate sequential pattern mining techniques have been introduced to address these shortcomings, yet existing approximate methods fail to reflect the true frequent sequential patterns or only target single-item event sequences. Multi-item event sequences are prominent in healthcare, as a patient can have multiple interventions during a single visit. To alleviate these issues, we propose GASP, a graph-based approximate sequential pattern mining method that discovers frequent patterns for multi-item event sequences. Our approach compresses the sequential information into a concise graph structure, which has computational benefits. Empirical results on two healthcare datasets suggest that GASP outperforms existing approximate models by improving recoverability and that it extracts better predictive patterns.
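
To make the problem setting concrete, the following is a minimal sketch of what a multi-item event sequence is and how the support of a sequential pattern is conventionally counted. It is not the GASP algorithm, whose graph construction is not described in this abstract; it only illustrates the standard containment definition. The function names and the toy patient data are hypothetical.

```python
from typing import FrozenSet, Sequence

Event = FrozenSet[str]  # one visit: the set of interventions recorded together


def contains(sequence: Sequence[Event], pattern: Sequence[Event]) -> bool:
    """Return True if every pattern itemset is a subset of a distinct
    event in the sequence, with the events appearing in the same order."""
    pos = 0
    for event in sequence:
        # Greedy matching: take the earliest event that covers the next itemset.
        if pos < len(pattern) and pattern[pos] <= event:
            pos += 1
    return pos == len(pattern)


def support(database: Sequence[Sequence[Event]], pattern: Sequence[Event]) -> float:
    """Fraction of sequences in the database that contain the pattern."""
    if not database:
        return 0.0
    return sum(contains(seq, pattern) for seq in database) / len(database)


# Hypothetical toy data: each patient is a sequence of multi-item visits.
patients = [
    [frozenset({"lab", "iv_fluids"}), frozenset({"antibiotic"})],
    [frozenset({"lab"}), frozenset({"antibiotic", "iv_fluids"})],
    [frozenset({"antibiotic"}), frozenset({"lab"})],
]
pattern = [frozenset({"lab"}), frozenset({"antibiotic"})]
print(support(patients, pattern))
```

In this toy example the pattern "lab, then antibiotic" is contained in the first two patients but not the third, where the order is reversed. An exact miner would enumerate every pattern whose support exceeds a threshold, which is the cost that approximate methods aim to reduce.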

Original language: English
Title of host publication: New Trends in Database and Information Systems - ADBIS 2021 Short Papers, Doctoral Consortium and Workshops
Subtitle of host publication: DOING, SIMPDA, MADEISD, MegaData, CAoNS, Proceedings
Editors: Ladjel Bellatreche, Marlon Dumas, Panagiotis Karras, Raimundas Matulevičius
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 50-60
Number of pages: 11
ISBN (Print): 9783030850814
State: Published - 2021
Externally published: Yes
Event: 25th East-European Conference on Advances in Databases and Information Systems, ADBIS 2021 co-allocated with Workshops on DOING, SIMPDA, MADEISD, MegaData, CAoNS 2021 - Tartu, Estonia
Duration: Aug 24 2021 → Aug 26 2021

Publication series

Name: Communications in Computer and Information Science
Volume: 1450 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 25th East-European Conference on Advances in Databases and Information Systems, ADBIS 2021 co-allocated with Workshops on DOING, SIMPDA, MADEISD, MegaData, CAoNS 2021
Country/Territory: Estonia
City: Tartu
Period: 08/24/21 → 08/26/21

Funding

Acknowledgements. This work was supported by the National Science Foundation award IIS-#1838200 and the National Institutes of Health (NIH) awards 1R01LM013323 and 5K01LM012924.

Keywords

  • Healthcare data
  • Sequential pattern mining
