TY - GEN
T1 - Document clustering using particle swarm optimization
AU - Cui, Xiaohui
AU - Potok, Thomas E.
AU - Palathingal, Paul
PY - 2005
Y1 - 2005
AB - Fast and high-quality document clustering algorithms play an important role in effectively navigating, summarizing, and organizing information. Recent studies have shown that partitional clustering algorithms are more suitable for clustering large datasets. However, the K-means algorithm, the most commonly used partitional clustering algorithm, can only generate a locally optimal solution. In this paper, we present a Particle Swarm Optimization (PSO) document clustering algorithm. In contrast to the localized search of the K-means algorithm, the PSO clustering algorithm performs a globalized search over the entire solution space. In our experiments, we applied the PSO, K-means, and hybrid PSO clustering algorithms to four different text document datasets. The number of documents in the datasets ranges from 204 to over 800, and the number of terms ranges from over 5000 to over 7000. The results illustrate that the hybrid PSO algorithm can generate more compact clustering results than the K-means algorithm.
UR - http://www.scopus.com/inward/record.url?scp=33745786081&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:33745786081
SN - 0780389166
SN - 9780780389168
T3 - Proceedings - 2005 IEEE Swarm Intelligence Symposium, SIS 2005
SP - 191
EP - 197
BT - Proceedings - 2005 IEEE Swarm Intelligence Symposium, SIS 2005
T2 - 2005 IEEE Swarm Intelligence Symposium, SIS 2005
Y2 - 8 June 2005 through 10 June 2005
ER -