Impact of low class prevalence on the performance evaluation of neural network based classifiers: Experimental study in the context of computer-assisted medical diagnosis

Maciej A. Mazurowski, Piotr A. Habas, Jacek M. Zurada, Georgia D. Tourassi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

This paper presents an experimental study of the impact of low class prevalence on the performance of neural network based classifiers, as measured using Receiver Operating Characteristic (ROC) analysis. Two methods of dealing with the problem, oversampling and undersampling, are investigated while varying the class prevalence and the size of training datasets with uncorrelated and correlated features. The results show that class imbalance can significantly decrease classifier performance, especially for small training datasets. Furthermore, oversampling is shown to be more effective than undersampling in compensating for class imbalance. Statistically significant differences, however, are observed only in cases with a large total number of samples and very low prevalence.
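The abstract names the two resampling strategies and the ROC-based evaluation but does not spell out how they were implemented. The following is only a minimal sketch under stated assumptions: oversampling is taken to mean random duplication of minority-class samples, undersampling to mean random removal of majority-class samples, and ROC performance is summarized by the area under the curve computed from pairwise ranking. The function names (`oversample`, `undersample`, `auc`) and the 5% prevalence toy labels are hypothetical and are not taken from the paper.

```python
import random


def split_by_class(labels):
    """Return (indices of positive samples, indices of negative samples)."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    return pos, neg


def oversample(labels, rng):
    """Assumed variant: duplicate randomly chosen minority samples
    until both classes are equally represented. Both classes must be present."""
    pos, neg = split_by_class(labels)
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return sorted(majority + minority + extra)


def undersample(labels, rng):
    """Assumed variant: keep a random subset of the majority class
    equal in size to the minority class."""
    pos, neg = split_by_class(labels)
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    kept = rng.sample(majority, len(minority))
    return sorted(minority + kept)


def auc(scores, labels):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly, counting ties as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy training labels with 5% prevalence of the positive class (illustrative only).
rng = random.Random(0)
labels = [1] * 5 + [0] * 95

over_idx = oversample(labels, rng)    # indices into the original training set
under_idx = undersample(labels, rng)

# AUC on a tiny hand-made example of classifier scores.
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

In this sketch both functions return indices into the original training set, so the same index list can be applied to features and labels. Oversampling keeps every original sample and enlarges the training set, while undersampling discards majority-class samples, which is one plausible reason the two methods could behave differently on small datasets.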

Original language: English
Title of host publication: The 2007 International Joint Conference on Neural Networks, IJCNN 2007 Conference Proceedings
Pages: 2005-2009
Number of pages: 5
State: Published - 2007
Externally published: Yes
Event: 2007 International Joint Conference on Neural Networks, IJCNN 2007 - Orlando, FL, United States
Duration: Aug 12 2007 → Aug 17 2007

Publication series

Name: IEEE International Conference on Neural Networks - Conference Proceedings
ISSN (Print): 1098-7576

Conference

Conference: 2007 International Joint Conference on Neural Networks, IJCNN 2007
Country/Territory: United States
City: Orlando, FL
Period: 08/12/07 → 08/17/07
