TY - GEN
T1 - Characterizing mammography reports for health analytics
AU - Rojas, Carlos
AU - Patton, Robert
AU - Beckerman, Barbara
PY - 2010
Y1 - 2010
AB - As massive collections of digital health data are becoming available, the opportunities for large-scale automated analysis increase. In particular, the widespread collection of detailed health information is expected to help realize a vision of evidence-based public health and patient-centric health care. Within such a framework for large-scale health analytics, we describe several methods to characterize and analyze free-text mammography reports, including their temporal dimension, using information retrieval, supervised learning, and classical statistical techniques. We present experimental results with a large collection of mostly unlabeled reports that demonstrate the validity and usefulness of the approach, since these results are consistent with the known features of the data and provide novel insights about it.
KW - clinical notes
KW - electronic health records
KW - temporal analysis
KW - text analysis
UR - http://www.scopus.com/inward/record.url?scp=78650933809&partnerID=8YFLogxK
U2 - 10.1145/1882992.1883022
DO - 10.1145/1882992.1883022
M3 - Conference contribution
AN - SCOPUS:78650933809
SN - 9781450300308
T3 - IHI'10 - Proceedings of the 1st ACM International Health Informatics Symposium
SP - 201
EP - 209
BT - IHI'10 - Proceedings of the 1st ACM International Health Informatics Symposium
T2 - 1st ACM International Health Informatics Symposium, IHI'10
Y2 - 11 November 2010 through 12 November 2010
ER -