Information Extraction from Cancer Pathology Reports with Graph Convolution Networks for Natural Language Texts

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

12 Scopus citations

Abstract

Graph-of-words is a flexible and efficient text representation that addresses well-known challenges in natural language processing, such as word ordering and variation of expressions. In this paper, we consider the latest graph-based convolutional neural network technique, the Text Graph Convolutional Network (Text GCN), in the context of performing classification tasks on free-form natural language texts. To do this, we designed a study of multi-task information extraction from medical text documents. We implemented multi-task learning in the Text GCN, performed hyperparameter optimization, and measured the clinical task performances. We evaluated micro- and macro-F1 scores on four information extraction tasks, including subsite, laterality, behavior, and histological grades, from cancer pathology reports. The scores for the Text GCN significantly exceeded those from our previous studies with convolutional neural networks, suggesting that the Text GCN model is superior to traditional models in task performance.
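The abstract reports both micro- and macro-F1 scores. The two can diverge sharply when some classes are rare, as is common in pathology report labels. The following is a minimal sketch of how the two averages are typically computed for single-label classification; the function name `f1_scores` and the laterality labels are hypothetical and for illustration only, and are not taken from the paper's code or data.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Hypothetical labels (assumption: not data from the paper).
y_true = ["left", "left", "left", "right", "bilateral"]
y_pred = ["left", "left", "right", "right", "left"]
micro, macro = f1_scores(y_true, y_pred)
print(f"micro-F1 = {micro:.3f}, macro-F1 = {macro:.3f}")
```

In this toy case the rare "bilateral" class is never predicted correctly, so it pulls the macro average well below the micro average, which is why reporting both gives a fuller picture of per-task performance.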

Original language: English
Title of host publication: Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019
Editors: Chaitanya Baru, Jun Huan, Latifur Khan, Xiaohua Tony Hu, Ronay Ak, Yuanyuan Tian, Roger Barga, Carlo Zaniolo, Kisung Lee, Yanfang Fanny Ye
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 4561-4564
Number of pages: 4
ISBN (Electronic): 9781728108582
State: Published - Dec 2019
Event: 2019 IEEE International Conference on Big Data, Big Data 2019 - Los Angeles, United States
Duration: Dec 9 2019 to Dec 12 2019

Publication series

Name: Proceedings - 2019 IEEE International Conference on Big Data, Big Data 2019

Conference

Conference: 2019 IEEE International Conference on Big Data, Big Data 2019
Country/Territory: United States
City: Los Angeles
Period: 12/9/19 to 12/12/19

Funding

This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of the manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. This work has been supported in part by the Joint Design of Advanced Computing Solutions for Cancer (JDACS4C) program established by the U.S. Department of Energy (DOE) and the National Cancer Institute (NCI) of the National Institutes of Health. This work was performed under the auspices of the U.S. Department of Energy by Argonne National Laboratory under Contract DE-AC02-06-CH11357, Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, Los Alamos National Laboratory under Contract DE-AC52-06NA25396, and Oak Ridge National Laboratory under Contract DE-AC05-00OR22725.

Keywords

  • biomedical informatics
  • convolutional neural network
  • graph convolutional network
  • natural language processing
  • text classification
