TY - GEN
T1 - Randomized algorithms to update partial singular value decomposition on a hybrid CPU/GPU cluster
AU - Yamazaki, Ichitaro
AU - Kurzak, Jakub
AU - Luszczek, Piotr
AU - Dongarra, Jack
N1 - Publisher Copyright:
© 2015 ACM.
PY - 2015/11/15
Y1 - 2015/11/15
N2 - For data analysis, a partial singular value decomposition (SVD) of the sparse matrix representing the data is a powerful tool. However, computing the SVD of a large matrix can take a significant amount of time even on a current high-performance supercomputer. Hence, there is a growing interest in a novel algorithm that can quickly compute the SVD for efficiently processing massive amounts of data that are being generated from many modern applications. To respond to this demand, in this paper, we study randomized algorithms that update the SVD as changes are made to the data, which is often more efficient than recomputing the SVD from scratch. Furthermore, in some applications, recomputing the SVD may not be possible because the original data, for which the SVD has been already computed, is no longer available. Our experimental results with the data sets for the Latent Semantic Indexing and population clustering demonstrate that these randomized algorithms can obtain the desired accuracy of the SVD with a small number of data accesses, and compared to the state-of-the-art updating algorithm, they often require much lower computational and communication costs. Our performance results on a hybrid CPU/GPU cluster show that these randomized algorithms can obtain significant speedups over the state-of-the-art updating algorithm.
AB - For data analysis, a partial singular value decomposition (SVD) of the sparse matrix representing the data is a powerful tool. However, computing the SVD of a large matrix can take a significant amount of time even on a current high-performance supercomputer. Hence, there is a growing interest in a novel algorithm that can quickly compute the SVD for efficiently processing massive amounts of data that are being generated from many modern applications. To respond to this demand, in this paper, we study randomized algorithms that update the SVD as changes are made to the data, which is often more efficient than recomputing the SVD from scratch. Furthermore, in some applications, recomputing the SVD may not be possible because the original data, for which the SVD has been already computed, is no longer available. Our experimental results with the data sets for the Latent Semantic Indexing and population clustering demonstrate that these randomized algorithms can obtain the desired accuracy of the SVD with a small number of data accesses, and compared to the state-of-the-art updating algorithm, they often require much lower computational and communication costs. Our performance results on a hybrid CPU/GPU cluster show that these randomized algorithms can obtain significant speedups over the state-of-the-art updating algorithm.
UR - https://www.scopus.com/pages/publications/84966687201
U2 - 10.1145/2807591.2807608
DO - 10.1145/2807591.2807608
M3 - Conference contribution
AN - SCOPUS:84966687201
T3 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC
BT - Proceedings of SC 2015
PB - IEEE Computer Society
T2 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2015
Y2 - 15 November 2015 through 20 November 2015
ER -