Abstract
Pathology reports are a primary source of information for cancer registries, which process high volumes of free-text reports annually. Information extraction and coding is a manual, labor-intensive process. In this talk, I will discuss recent deep learning technology from both theoretical and practical perspectives relevant to natural language processing of clinical pathology reports. Using different deep learning architectures, I will present benchmark studies for various information extraction tasks and discuss their importance in supporting a comprehensive and scalable national cancer surveillance program.
Original language | English |
---|---|
Title of host publication | Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017 |
Editors | Jian-Yun Nie, Zoran Obradovic, Toyotaro Suzumura, Rumi Ghosh, Raghunath Nambiar, Chonggang Wang, Hui Zang, Ricardo Baeza-Yates, Xiaohua Hu, Jeremy Kepner, Alfredo Cuzzocrea, Jian Tang, Masashi Toyoda |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 3982-3983 |
Number of pages | 2 |
ISBN (Electronic) | 9781538627143 |
State | Published - Jul 1 2017 |
Event | 5th IEEE International Conference on Big Data, Big Data 2017 - Boston, United States. Duration: Dec 11 2017 → Dec 14 2017 |
Publication series
Name | Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017 |
---|---|
Volume | 2018-January |
Conference
Conference | 5th IEEE International Conference on Big Data, Big Data 2017 |
---|---|
Country/Territory | United States |
City | Boston |
Period | 12/11/17 → 12/14/17 |
Funding
This work has been supported in part by the Joint Design of Advanced Computing Solutions for Cancer (JDACS4C) program established by the U.S. Department of Energy (DOE) and the National Cancer Institute (NCI) of the National Institutes of Health. The authors wish to thank Valentina Petkov of the Surveillance Research Program at the National Cancer Institute and the SEER registries at HI, KY, CT, NM and Seattle for the de-identified pathology reports used in this investigation. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.
Keywords
- cancer
- deep learning
- natural language processing
- surveillance