TY - JOUR
T1 - The effect of class imbalance on case selection for case-based classifiers
T2 - An empirical study in the context of medical decision support
AU - Malof, Jordan M.
AU - Mazurowski, Maciej A.
AU - Tourassi, Georgia D.
PY - 2012/1
Y1 - 2012/1
N2 - Case selection is a useful approach for increasing the efficiency and performance of case-based classifiers. Multiple techniques have been designed to perform case selection. This paper empirically investigates how class imbalance in the available set of training cases can impact the performance of the resulting classifier as well as properties of the selected set. In this study, the experiments are performed using a dataset for the problem of detecting breast masses in screening mammograms. The classification problem was binary and we used a k-nearest neighbor classifier. The classifier's performance was evaluated using the Receiver Operating Characteristic (ROC) area under the curve (AUC) measure. The experimental results indicate that although class imbalance reduces the performance of the derived classifier and the effectiveness of selection at improving overall classifier performance, case selection can still be beneficial, regardless of the level of class imbalance.
KW - Case selection
KW - Case-based reasoning
KW - Class imbalance
KW - Computer-aided decision
UR - http://www.scopus.com/inward/record.url?scp=82355169734&partnerID=8YFLogxK
U2 - 10.1016/j.neunet.2011.07.002
DO - 10.1016/j.neunet.2011.07.002
M3 - Article
C2 - 21820273
AN - SCOPUS:82355169734
SN - 0893-6080
VL - 25
SP - 141
EP - 145
JO - Neural Networks
JF - Neural Networks
ER -