Knowledge Graph-Enabled Cancer Data Analytics

S. M. Shamimul Hasan, Donna Rivera, Xiao Cheng Wu, Eric B. Durbin, J. Blair Christian, Georgia Tourassi

Research output: Contribution to journal › Article › peer-review

34 Scopus citations

Abstract

Cancer registries collect unstructured and structured cancer data for surveillance purposes, providing important insights into cancer characteristics, treatments, and outcomes. Cancer registry data typically (1) categorize each reportable cancer case or tumor at the time of diagnosis, (2) contain demographic information about the patient, such as age, gender, and location at the time of diagnosis, (3) include planned and completed primary treatment information, and (4) may contain survival outcomes. As structured data are extracted from various unstructured sources, such as pathology reports, radiology reports, and medical records, and stored for reporting and other needs, the associated information representing a reportable cancer is constantly expanding and evolving. While popular analytic tools such as SEER*Stat and SAS exist, we provide a knowledge graph approach to organizing cancer registry data. Our approach offers unique advantages for timely data analysis and for the presentation and visualization of valuable information. This knowledge graph approach semantically enriches the data and readily enables linking with third-party data, which can help explain variation in cancer incidence patterns, disparities, and outcomes. We developed a prototype knowledge graph based on the Louisiana Tumor Registry dataset. We present the advantages of the knowledge graph approach by examining (i) scenario-specific queries, (ii) links with openly available external datasets, (iii) schema evolution for iterative analysis, and (iv) data visualization. Our results demonstrate that this graph-based solution can perform complex queries, improve query run-time performance by up to 76%, and more easily support iterative analyses to enhance researchers' understanding of cancer registry data.
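As a rough illustration of the ideas named in the abstract (scenario-specific queries, links with external datasets, and schema evolution), the following minimal sketch stores registry-like records as subject-predicate-object triples. It is not the authors' implementation: the records are synthetic, and every name in it (`build_graph`, `hasPrimarySite`, `diagnosedIn`, `smokingRate`, the county labels) is a hypothetical chosen for the example.

```python
# Synthetic, hypothetical records; not drawn from any real registry.
cases = [
    {"id": "case1", "site": "Lung", "county": "CountyA", "treatment": "Surgery"},
    {"id": "case2", "site": "Breast", "county": "CountyB", "treatment": "Radiation"},
    {"id": "case3", "site": "Lung", "county": "CountyB", "treatment": "Chemotherapy"},
]

# Hypothetical external dataset keyed by county (assumed values).
county_smoking_rate = {"CountyA": 0.25, "CountyB": 0.5}


def build_graph(cases, external):
    """Turn tabular records into a set of (subject, predicate, object) triples."""
    triples = set()
    for c in cases:
        triples.add((c["id"], "hasPrimarySite", c["site"]))
        triples.add((c["id"], "diagnosedIn", c["county"]))
        triples.add((c["id"], "receivedTreatment", c["treatment"]))
    # Linking external data only requires adding triples about shared nodes.
    for county, rate in external.items():
        triples.add((county, "smokingRate", rate))
    return triples


def objects(triples, subject, predicate):
    """All objects reachable from `subject` via `predicate`."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def subjects(triples, predicate, obj):
    """All subjects that point to `obj` via `predicate`."""
    return {s for s, p, o in triples if p == predicate and o == obj}


def lung_cases_high_smoking(triples, threshold):
    """Scenario-style query: lung cases diagnosed in counties whose
    (external) smoking rate is at or above `threshold`."""
    result = set()
    for case in subjects(triples, "hasPrimarySite", "Lung"):
        for county in objects(triples, case, "diagnosedIn"):
            if any(rate >= threshold for rate in objects(triples, county, "smokingRate")):
                result.add(case)
    return result


graph = build_graph(cases, county_smoking_rate)

# Schema evolution: a new kind of fact is just a new predicate,
# with no table alteration or migration step.
graph.add(("case1", "hasSurvivalMonths", 30))

print(lung_cases_high_smoking(graph, 0.4))
```

A production system would more plausibly use a graph database or RDF store with a query language rather than Python sets; the sketch only shows why traversing from a case to its county to an external attribute is a natural operation in a graph model.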

Original language: English
Article number: 9086146
Pages (from-to): 1952-1967
Number of pages: 16
Journal: IEEE Journal of Biomedical and Health Informatics
Volume: 24
Issue number: 7
State: Published - Jul 2020

Funding

Manuscript received November 15, 2019; revised March 12, 2020; accepted April 17, 2020. Date of publication May 4, 2020; date of current version July 2, 2020. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). (Corresponding author: S. M. Shamimul Hasan.) S. M. Shamimul Hasan, J. Blair Christian, and Georgia Tourassi are with Oak Ridge National Laboratory, Oak Ridge, TN 37830 USA.

Funders: Funder number
National Institutes of Health
U.S. Department of Energy
National Cancer Institute: P30CA177558
Argonne National Laboratory: DE-AC02-06-CH11357
Lawrence Livermore National Laboratory: DE-AC52-07NA27344
Oak Ridge National Laboratory: DE-AC05-00OR22725
Los Alamos National Laboratory: DE-AC5206NA25396

    Keywords

    • Knowledge graph
    • cancer registry
    • treatment
