TY - GEN
T1 - Inverse regression for extraction of tumor site from cancer pathology reports
AU - Dubey, Abhishek K.
AU - Yoon, Hong Jun
AU - Tourassi, Georgia D.
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/5
Y1 - 2019/5
N2 - Pathology reports are the primary source of information for the cancer diagnoses of millions of patients across the United States. Cancer registries label these reports every year. The coded labels incorporate pertinent information such as cancer location, behavior, and histology. This information, when combined with clinical information, medical imaging, and even genomic information, has great potential to fuel discoveries in cancer research. The coding process is manual and requires many human experts to label the large volume of pathology reports in a timely manner. In this study, we developed a supervised, inverse-regression-based auto-labeler to automate the task. The experiments were conducted on a set of 942 pathology reports with human expert labels as the ground truth. We observed that the inverse-regression-based auto-labeler consistently performed better than or comparably to the best-performing state-of-the-art method. These results demonstrate the potential of inverse regression for reliable information extraction from pathology reports.
AB - Pathology reports are the primary source of information for the cancer diagnoses of millions of patients across the United States. Cancer registries label these reports every year. The coded labels incorporate pertinent information such as cancer location, behavior, and histology. This information, when combined with clinical information, medical imaging, and even genomic information, has great potential to fuel discoveries in cancer research. The coding process is manual and requires many human experts to label the large volume of pathology reports in a timely manner. In this study, we developed a supervised, inverse-regression-based auto-labeler to automate the task. The experiments were conducted on a set of 942 pathology reports with human expert labels as the ground truth. We observed that the inverse-regression-based auto-labeler consistently performed better than or comparably to the best-performing state-of-the-art method. These results demonstrate the potential of inverse regression for reliable information extraction from pathology reports.
KW - Cancer pathology reports
KW - Classification
KW - Localized sliced inverse regression
KW - Supervised subspace learning
UR - http://www.scopus.com/inward/record.url?scp=85073005685&partnerID=8YFLogxK
U2 - 10.1109/BHI.2019.8834527
DO - 10.1109/BHI.2019.8834527
M3 - Conference contribution
AN - SCOPUS:85073005685
T3 - 2019 IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2019 - Proceedings
BT - 2019 IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2019
Y2 - 19 May 2019 through 22 May 2019
ER -