Abstract
The purpose of this study was to identify and characterize clusters in a heterogeneous breast cancer computer-aided diagnosis database. Identifying subgroups within the database could help elucidate clinical trends and facilitate future model building. Agglomerative hierarchical clustering and k-means clustering were used to identify clusters in a large, heterogeneous computer-aided diagnosis database based on mammographic findings (BI-RADS™) and patient age. The clusters were then examined in terms of their feature distributions. The clusters showed logical separation of distinct clinical subtypes such as architectural distortions, masses, and calcifications. Moreover, the common subtypes of masses and calcifications were further stratified into clusters by age group. The percentage of malignant cases differed notably among the clusters. Cluster analysis can be a powerful tool for discerning the subgroups present in a large, heterogeneous computer-aided diagnosis database.
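To make the workflow in the abstract concrete, the following is a minimal sketch of k-means clustering followed by a per-cluster malignancy summary. It is not the authors' implementation: the records, the three-feature encoding (scaled age, mass flag, calcification flag), the farthest-first initialization, and all function names are assumptions chosen for illustration, and the agglomerative hierarchical step is not shown.

```python
# Hypothetical toy records: (age in years, mass present, calcification present, malignant).
# These values are invented for illustration and are NOT taken from the study's database.
records = [
    (35, 1, 0, 0), (38, 1, 0, 0), (42, 1, 0, 0),   # younger patients with masses
    (68, 1, 0, 1), (72, 1, 0, 1), (75, 1, 0, 0),   # older patients with masses
    (50, 0, 1, 0), (55, 0, 1, 1), (58, 0, 1, 0),   # patients with calcifications
]

def features(rec):
    """Encode a record as (age / 100, mass flag, calcification flag)."""
    age, mass, calc, _ = rec
    return (age / 100.0, float(mass), float(calc))

def dist2(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first(points, k):
    """Deterministic initialization (an assumption): start at the first point,
    then repeatedly add the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(dist2(p, c) for c in centroids))
        centroids.append(nxt)
    return centroids

def kmeans(points, k, max_iter=100):
    """Plain k-means: alternate assignment and centroid update until labels stop changing."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: dist2(p, centroids[j])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

def malignancy_by_cluster(recs, labels, k):
    """Fraction of malignant cases in each cluster."""
    rates = {}
    for j in range(k):
        outcomes = [r[3] for r, lab in zip(recs, labels) if lab == j]
        rates[j] = sum(outcomes) / len(outcomes) if outcomes else float("nan")
    return rates

points = [features(r) for r in records]
labels, centroids = kmeans(points, k=3)
rates = malignancy_by_cluster(records, labels, 3)

for j in range(3):
    ages = [r[0] for r, lab in zip(records, labels) if lab == j]
    print(f"cluster {j}: n={len(ages)}, mean age={sum(ages) / len(ages):.1f}, "
          f"malignant fraction={rates[j]:.2f}")
```

On this toy data the clusters separate calcifications from masses and split the masses by age, which mirrors the kind of stratification the abstract describes. The malignancy fractions are a property of the invented records only and say nothing about the study's results.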
Original language | English |
---|---|
Pages (from-to) | 363-370 |
Number of pages | 8 |
Journal | Proceedings of SPIE - The International Society for Optical Engineering |
Volume | 4684 I |
State | Published - 2002 |
Externally published | Yes |
Event | Medical Imaging 2002: Image Processing - San Diego, CA, United States; Duration: Feb 24 2002 → Feb 28 2002 |
Keywords
- Agglomerative hierarchical clustering
- Breast cancer
- Computer-aided diagnosis
- Data mining
- Unsupervised learning
- k-means