TY - GEN
T1 - Automated histologic grading from free-text pathology reports using graph-of-words features and machine learning
AU - Yoon, Hong Jun
AU - Roberts, Larry
AU - Tourassi, Georgia
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/4/11
Y1 - 2017/4/11
AB - Traditional n-gram feature representations of free-text documents often fail to capture word ordering and semantics, thus compromising text comprehension. Graph-of-words, a new text representation approach based on graph analytics, overcomes these limitations by modeling word co-occurrence. In this study, we present a novel application of graph-of-words text description to the automated extraction of histologic grade from unstructured pathology reports. In 10-fold cross-validation tests, the proposed approach with undirected graph-of-words features achieved substantially higher macro- and micro-F1 scores than traditional bi-gram text features. Our feasibility study demonstrated that graph-of-words is a highly efficient method of text comprehension for information extraction from free-text clinical documents.
UR - http://www.scopus.com/inward/record.url?scp=85018453373&partnerID=8YFLogxK
U2 - 10.1109/BHI.2017.7897282
DO - 10.1109/BHI.2017.7897282
M3 - Conference contribution
AN - SCOPUS:85018453373
T3 - 2017 IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2017
SP - 369
EP - 372
BT - 2017 IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th IEEE EMBS International Conference on Biomedical and Health Informatics, BHI 2017
Y2 - 16 February 2017 through 19 February 2017
ER -