Inverse regression for extraction of tumor site from cancer pathology reports

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

2 Scopus citations

Abstract

Pathology reports are the primary source of information for the cancer diagnoses of millions of patients across the United States. Cancer registries label these reports every year. The coded labels capture pertinent information such as cancer location, behavior, and histology. Combined with clinical information, medical imaging, and even genomic information, these labels have great potential to fuel discoveries in cancer research. The coding process is manual and requires many human experts to label the large volume of pathology reports in a timely manner. In this study, we developed a supervised, inverse-regression-based auto-labeler to automate the task. The experiments were conducted on a set of 942 pathology reports, with human expert labels as the ground truth. We observed that the inverse-regression-based auto-labeler consistently performed better than, or comparably to, the best-performing state-of-the-art method. These results demonstrate the potential of inverse regression for reliable information extraction from pathology reports.
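To make the idea of a supervised inverse-regression labeler concrete, the following is a minimal sketch, not the authors' implementation. It assumes plain sliced inverse regression with the class labels used as slices (not the localized variant named in the keywords), a ridge-regularized covariance, and a nearest-centroid classifier in the reduced space. The function names, the `ridge` parameter, and the synthetic data are all hypothetical; real reports would first need to be converted to numeric feature vectors.

```python
import numpy as np


def fit_sir(X, y, n_components, ridge=1e-3):
    """Estimate SIR directions, treating each class label as one slice."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mean = X.mean(axis=0)
    Xc = X - mean
    n, p = Xc.shape
    # Ridge-regularized covariance of the features (assumption: keeps it invertible).
    cov = Xc.T @ Xc / n + ridge * np.eye(p)
    # Between-slice covariance built from the centered class means.
    gamma = np.zeros((p, p))
    for c in np.unique(y):
        idx = y == c
        m = Xc[idx].mean(axis=0)
        gamma += idx.mean() * np.outer(m, m)
    # Solve gamma b = lambda cov b by whitening with cov^(-1/2).
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    w, v = np.linalg.eigh(inv_sqrt @ gamma @ inv_sqrt)
    order = np.argsort(w)[::-1][:n_components]
    directions = inv_sqrt @ v[:, order]
    return mean, directions


def fit_centroids(Z, y):
    """Class centroids in the reduced space."""
    classes = np.unique(y)
    centroids = np.stack([Z[y == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict(Z, classes, centroids):
    """Assign each projected sample to its nearest class centroid."""
    dist = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dist.argmin(axis=1)]


def make_toy_data(n_per_class=100, n_features=10, seed=0):
    """Synthetic stand-in for report features: three well-separated classes."""
    rng = np.random.default_rng(seed)
    offsets = np.zeros((3, n_features))
    offsets[1, 0] = 6.0
    offsets[2, 1] = 6.0
    X = np.vstack([rng.normal(size=(n_per_class, n_features)) + o for o in offsets])
    y = np.repeat(np.arange(3), n_per_class)
    return X, y


def demo():
    """Fit on the toy data and return the directions and training accuracy."""
    X, y = make_toy_data()
    mean, directions = fit_sir(X, y, n_components=2)
    Z = (X - mean) @ directions
    classes, centroids = fit_centroids(Z, y)
    accuracy = float((predict(Z, classes, centroids) == y).mean())
    return directions, accuracy
```

With K classes the between-slice covariance has rank at most K - 1, so at most K - 1 directions carry information; the sketch therefore uses two components for three classes.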

Original language: English
Title of host publication: 2019 IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728108483
State: Published - May 2019
Event: 2019 IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2019 - Chicago, United States
Duration: May 19 2019 to May 22 2019

Publication series

Name: 2019 IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2019 - Proceedings

Conference

Conference: 2019 IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2019
Country/Territory: United States
City: Chicago
Period: 05/19/19 to 05/22/19

Funding

This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of the manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). National Laboratory under Contract DE-AC02-06-CH11357

Funders and funder numbers
National Laboratory
National Institutes of Health
U.S. Department of Energy
National Cancer Institute
Argonne National Laboratory: DE-AC02-06-CH11357
Lawrence Livermore National Laboratory: DEAC52-07NA27344
Oak Ridge National Laboratory: DE-AC05-00OR22725
Los Alamos National Laboratory: DE-AC5206NA25396

    Keywords

    • Cancer pathology reports
    • Classification
    • Localized sliced inverse regression
    • Supervised subspace learning
