Abstract
Pathology reports are the primary source of information for the cancer diagnoses of millions of patients across the United States. Cancer registries label these reports every year. The coded labels capture pertinent information such as cancer location, behavior, and histology. This information, when combined with clinical information, medical imaging, and even genomic information, has great potential to fuel discoveries in cancer research. The coding process is manual and requires many human experts to label the large volume of pathology reports in a timely manner. In this study, we developed a supervised, inverse-regression-based auto-labeler to automate the task. The experiments were conducted on a set of 942 pathology reports with human expert labels as the ground truth. We observed that the inverse-regression-based auto-labeler consistently performed better than, or comparably to, the best-performing state-of-the-art method. These results demonstrate the potential of inverse regression for reliable information extraction from pathology reports.
Original language | English |
---|---|
Title of host publication | 2019 IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2019 - Proceedings |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
ISBN (Electronic) | 9781728108483 |
State | Published - May 2019 |
Event | 2019 IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2019 - Chicago, United States. Duration: May 19, 2019 → May 22, 2019 |
Conference
Conference | 2019 IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2019 |
---|---|
Country/Territory | United States |
City | Chicago |
Period | 05/19/19 → 05/22/19 |
Funding
This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of the manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). National Laboratory under Contract DE-AC02-06-CH11357
Keywords
- Cancer pathology reports
- Classification
- Localized sliced inverse regression
- Supervised subspace learning