Automated histologic grading from free-text pathology reports using graph-of-words features and machine learning

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

12 Scopus citations

Abstract

Traditional n-gram feature representations of free-text documents often fail to capture word ordering and semantics, thus compromising text comprehension. Graph-of-words, a new text representation approach based on graph analytics, overcomes these limitations by modeling word co-occurrence. In this study, we present a novel application of graph-of-words text description for automated extraction of histologic grade from unstructured pathology reports. In 10-fold cross-validation tests, the proposed approach with undirected graph-of-words features yielded substantially higher macro- and micro-F1 scores than traditional bi-gram text features. Our feasibility study demonstrated that graph-of-words is a highly efficient method of text comprehension for information extraction from free-text clinical documents.
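
To illustrate the contrast the abstract draws, the following is a minimal sketch of undirected graph-of-words features next to plain bi-grams. It assumes whitespace tokenization and a sliding window of three tokens; the window size, the function names, and the sample sentence are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter


def graph_of_words(tokens, window=3):
    """Count undirected co-occurrence edges between words.

    Two words share an edge when they appear within `window` tokens of
    each other. The window size is an assumption made for this sketch.
    """
    edges = Counter()
    for i, left in enumerate(tokens):
        # Pair the current token with the next (window - 1) tokens.
        for right in tokens[i + 1 : i + window]:
            if left != right:
                # Sorting the pair makes the edge undirected.
                edges[tuple(sorted((left, right)))] += 1
    return edges


def bigrams(tokens):
    """Count ordered pairs of adjacent tokens (traditional bi-grams)."""
    return Counter(zip(tokens, tokens[1:]))


# Hypothetical report fragment, used only for illustration.
report = "poorly differentiated invasive ductal carcinoma grade 3"
tokens = report.lower().split()

gow_features = graph_of_words(tokens, window=3)
bigram_features = bigrams(tokens)

# The graph links "poorly" and "invasive" although they are not adjacent,
# a relation that bi-grams do not record.
print(("invasive", "poorly") in gow_features)
print(("poorly", "invasive") in bigram_features)
```

In a classification setting, the edge counts could be used as a sparse feature vector per report in place of bi-gram counts, with the classifier and evaluation protocol left unchanged.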

Original language: English
Title of host publication: 2017 IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 369-372
Number of pages: 4
ISBN (Electronic): 9781509041794
DOIs
State: Published - Apr 11 2017
Event: 4th IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2017 - Orlando, United States
Duration: Feb 16 2017 to Feb 19 2017

Publication series

Name: 2017 IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2017

Conference

Conference: 4th IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2017
Country/Territory: United States
City: Orlando
Period: 02/16/17 to 02/19/17
