Abstract
Sequential pattern mining can be used to extract meaningful sequences from electronic health records. However, conventional sequential pattern mining algorithms that discover all frequent sequential patterns can incur a high computational cost and be susceptible to noise in the observations. Approximate sequential pattern mining techniques have been introduced to address these shortcomings, yet existing approximate methods either fail to reflect the true frequent sequential patterns or only target single-item event sequences. Multi-item event sequences are prominent in healthcare, as a patient can have multiple interventions in a single visit. To alleviate these issues, we propose GASP, a graph-based approximate sequential pattern mining method that discovers frequent patterns for multi-item event sequences. Our approach compresses the sequential information into a concise graph structure, which has computational benefits. The empirical results on two healthcare datasets suggest that GASP outperforms existing approximate models by improving recoverability and extracts better predictive patterns.
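To make the notion of a frequent pattern in multi-item event sequences concrete, the following is a minimal sketch of exact support counting, assuming each patient record is an ordered list of visits and each visit is a set of interventions. It is not the GASP algorithm, whose graph construction is not described in this abstract; the names `contains` and `support` and the toy records are hypothetical and chosen only for illustration.

```python
from typing import FrozenSet, Sequence

# One event (e.g. a visit) is a set of items (e.g. interventions).
Event = FrozenSet[str]


def contains(sequence: Sequence[Event], pattern: Sequence[Event]) -> bool:
    """Return True if `pattern` occurs in `sequence` as an ordered subsequence.

    Each pattern event must be a subset of some sequence event, and the
    matched sequence events must appear in the same order as the pattern.
    Greedy earliest matching is sufficient for this containment check.
    """
    pos = 0  # index of the next pattern event still to be matched
    for event in sequence:
        if pos < len(pattern) and pattern[pos] <= event:
            pos += 1
    return pos == len(pattern)


def support(database: Sequence[Sequence[Event]], pattern: Sequence[Event]) -> float:
    """Fraction of sequences in `database` that contain `pattern`."""
    if not database:
        return 0.0
    return sum(contains(seq, pattern) for seq in database) / len(database)


# Hypothetical toy records: three patients, each a sequence of multi-item visits.
db = [
    [frozenset({"lab", "xray"}), frozenset({"surgery"})],
    [frozenset({"lab"}), frozenset({"xray"}), frozenset({"surgery", "rehab"})],
    [frozenset({"xray"}), frozenset({"lab"})],
]
pattern = [frozenset({"lab"}), frozenset({"surgery"})]

print(support(db, pattern))
```

A pattern is called frequent when its support meets a chosen minimum threshold. Exhaustively enumerating and counting every candidate pattern in this way is the computational cost that approximate methods aim to reduce.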
Original language | English |
---|---|
Title of host publication | New Trends in Database and Information Systems - ADBIS 2021 Short Papers, Doctoral Consortium and Workshops |
Subtitle of host publication | DOING, SIMPDA, MADEISD, MegaData, CAoNS, Proceedings |
Editors | Ladjel Bellatreche, Marlon Dumas, Panagiotis Karras, Raimundas Matulevičius |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 50-60 |
Number of pages | 11 |
ISBN (Print) | 9783030850814 |
State | Published - 2021 |
Externally published | Yes |
Event | 25th East-European Conference on Advances in Databases and Information Systems, ADBIS 2021 co-allocated with Workshops on DOING, SIMPDA, MADEISD, MegaData, CAoNS 2021 - Tartu, Estonia. Duration: Aug 24 2021 → Aug 26 2021 |
Publication series
Name | Communications in Computer and Information Science |
---|---|
Volume | 1450 CCIS |
ISSN (Print) | 1865-0929 |
ISSN (Electronic) | 1865-0937 |
Conference
Conference | 25th East-European Conference on Advances in Databases and Information Systems, ADBIS 2021 co-allocated with Workshops on DOING, SIMPDA, MADEISD, MegaData, CAoNS 2021 |
---|---|
Country/Territory | Estonia |
City | Tartu |
Period | 08/24/21 → 08/26/21 |
Funding
Acknowledgements. This work was supported by the National Science Foundation award IIS-#1838200 and the National Institutes of Health (NIH) awards 1R01LM013323 and 5K01LM012924.
Keywords
- Healthcare data
- Sequential pattern mining