Abstract
Principal component analysis (PCA) is a statistical technique for identifying the dependency structure of multivariate stochastic observations. PCA is frequently used in data mining applications. This paper considers PCA in the context of emerging network-based computing environments. It offers a technique for performing PCA on distributed and heterogeneous data sets with relatively small communication overhead. The technique is evaluated on several data sets, including one from a web mining application. This approach is likely to facilitate the development of distributed clustering, associative link analysis, and other heterogeneous data mining applications that frequently use PCA.
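As a point of reference for the abstract, the following is a minimal sketch of ordinary, centralized PCA on column-partitioned data. It is not the technique proposed in the paper. It assumes that "heterogeneous" means each site holds different feature columns for the same rows. The site names, the data values, and the helper functions `center`, `covariance`, and `leading_component` are all hypothetical. The sketch illustrates the baseline in which every column is shipped to one place before the analysis, which is the communication cost a distributed method would try to reduce.

```python
import math

def center(columns):
    """Subtract each column's mean; `columns` is a list of equal-length lists."""
    return [[x - sum(col) / len(col) for x in col] for col in columns]

def covariance(columns):
    """Sample covariance matrix of already-centered columns."""
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in columns]
            for ci in columns]

def mat_vec(matrix, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in matrix]

def leading_component(cov, iters=500):
    """Power iteration: unit-length dominant eigenvector and its eigenvalue."""
    v = [1.0] * len(cov)
    for _ in range(iters):
        w = mat_vec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
    eigenvalue = sum(vi * wi for vi, wi in zip(v, mat_vec(cov, v)))
    return v, eigenvalue

# Hypothetical data: the same 5 observations, with features split across two sites.
site_a = [[2.0, 4.0, 6.0, 8.0, 10.0]]    # feature x1, held at site A
site_b = [[1.0, 2.1, 2.9, 4.2, 5.0],     # feature x2, held at site B
          [5.0, 3.0, 4.0, 1.0, 2.0]]     # feature x3, held at site B

# Centralized baseline (an assumption, not the paper's method):
# join all columns at one site, then run PCA on the joined table.
joined = center(site_a + site_b)
cov = covariance(joined)
pc1, var1 = leading_component(cov)

# Fraction of total variance captured by the first principal component.
explained = var1 / sum(cov[i][i] for i in range(len(cov)))
print("first principal component:", pc1)
print("explained variance ratio:", explained)
```

The cross-site entries of `cov` (x1 against x2 and x3) are the ones that cannot be computed from either site's data alone, which is why this baseline moves raw columns between sites.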
Original language | English |
---|---|
Title of host publication | Principles of Data Mining and Knowledge Discovery - 4th European Conference, PKDD 2000, Proceedings |
Editors | Djamel A. Zighed, Jan Komorowski, Jan Zytkow |
Publisher | Springer Verlag |
Pages | 452-457 |
Number of pages | 6 |
ISBN (Print) | 9783540410669 |
State | Published - 2000 |
Externally published | Yes |
Event | 4th European Conference on Principles and Practice of Knowledge Discovery in Databases, PKDD 2000 - Lyon, France; Duration: Sep 13 2000 → Sep 16 2000 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 1910 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 4th European Conference on Principles and Practice of Knowledge Discovery in Databases, PKDD 2000 |
---|---|
Country/Territory | France |
City | Lyon |
Period | 09/13/00 → 09/16/00 |
Funding
The authors thank the American Cancer Society for supporting part of this research.