TY - GEN
T1 - Database decomposition of a knowledge-based CAD system in mammography: An ensemble approach to improve detection
AU - Mazurowski, Maciej A.
AU - Zurada, Jacek M.
AU - Tourassi, Georgia D.
PY - 2008
Y1 - 2008
N2 - Although ensemble techniques have been investigated in supervised machine learning, their potential with knowledge-based systems is unexplored. The purpose of this study is to investigate the ensemble approach with a knowledge-based (KB) CAD system for the detection of masses in screening mammograms. The system is designed to determine the presence of a mass in a query mammographic region of interest (ROI) based on its similarity with previously acquired examples of mass and normal cases. Similarity between images is assessed using normalized mutual information. Two approaches to knowledge database decomposition were investigated to create the ensemble. The first approach was random division of the knowledge database into a pre-specified number of equal-size, separate groups. The second approach was based on k-means clustering of the knowledge cases according to common texture features extracted from the ROIs. The ensemble components were fused using a linear classifier. Based on a database of 1820 ROIs (901 masses and 919 normal) and a leave-one-out cross-validation scheme, the ensemble techniques improved the performance of the original KB-CAD system (Az = 0.86 ± 0.01). Specifically, random division resulted in an ROC area index of Az = 0.90 ± 0.01, while k-means clustering provided further improvement (Az = 0.91 ± 0.01). Although marginal, the improvement was statistically significant. The superiority of the k-means clustering scheme was robust regardless of the number of clusters. This study supports incorporating ensemble techniques into knowledge-based systems in mammography.
AB - Although ensemble techniques have been investigated in supervised machine learning, their potential with knowledge-based systems is unexplored. The purpose of this study is to investigate the ensemble approach with a knowledge-based (KB) CAD system for the detection of masses in screening mammograms. The system is designed to determine the presence of a mass in a query mammographic region of interest (ROI) based on its similarity with previously acquired examples of mass and normal cases. Similarity between images is assessed using normalized mutual information. Two approaches to knowledge database decomposition were investigated to create the ensemble. The first approach was random division of the knowledge database into a pre-specified number of equal-size, separate groups. The second approach was based on k-means clustering of the knowledge cases according to common texture features extracted from the ROIs. The ensemble components were fused using a linear classifier. Based on a database of 1820 ROIs (901 masses and 919 normal) and a leave-one-out cross-validation scheme, the ensemble techniques improved the performance of the original KB-CAD system (Az = 0.86 ± 0.01). Specifically, random division resulted in an ROC area index of Az = 0.90 ± 0.01, while k-means clustering provided further improvement (Az = 0.91 ± 0.01). Although marginal, the improvement was statistically significant. The superiority of the k-means clustering scheme was robust regardless of the number of clusters. This study supports incorporating ensemble techniques into knowledge-based systems in mammography.
KW - Classification and classifier design
KW - Database construction
KW - Detection
KW - Mammography
UR - http://www.scopus.com/inward/record.url?scp=44349166207&partnerID=8YFLogxK
U2 - 10.1117/12.771556
DO - 10.1117/12.771556
M3 - Conference contribution
AN - SCOPUS:44349166207
SN - 9780819470997
T3 - Progress in Biomedical Optics and Imaging - Proceedings of SPIE
BT - Medical Imaging 2008 - Computer-Aided Diagnosis
T2 - Medical Imaging 2008 - Computer-Aided Diagnosis
Y2 - 19 February 2008 through 21 February 2008
ER -