Sampling algorithms to update truncated SVD

Ichitaro Yamazaki, Stanimire Tomov, Jack Dongarra

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

3 Scopus citations

Abstract

A truncated singular value decomposition (SVD) is a powerful tool for analyzing modern datasets. However, the massive volume and rapidly changing nature of the datasets often make it too expensive to compute the SVD of the whole dataset at once. It is more attractive to use only a part of the dataset at a time and incrementally update the SVD. A randomized algorithm has been shown to be a great alternative to a traditional updating algorithm due to its ability to efficiently filter out the noise and extract the relevant features of the dataset. Though it is often faster than the traditional algorithm, the randomized algorithm may need to access the data multiple times in order to extract the relevant features, and this data access creates a significant performance bottleneck. To improve the performance of the randomized algorithm for updating the SVD, we study, in this paper, two sampling algorithms that access the data only two or three times, respectively. We present several case studies to show that only a small fraction of the data may be needed to maintain the quality of the updated SVD, while our performance results on a hybrid CPU/GPU computer demonstrate the potential of the sampling algorithms to improve the performance of the randomized algorithm.
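To make the data-access cost mentioned in the abstract concrete, the following is a minimal NumPy sketch of a generic randomized-projection update of a rank-k SVD when new columns arrive. It is not the sampling algorithms studied in the paper, whose details the abstract does not give; the function name `randomized_update_svd`, its parameters (`oversample`, `power_iters`, `seed`), and the choice to append columns are all assumptions made for illustration. The sketch relies on the identity [U diag(s) Vt, D] = [U diag(s), D] · blockdiag(Vt, I), so only the smaller matrix C = [U diag(s), D] needs to be factored.

```python
import numpy as np

def randomized_update_svd(U, s, Vt, D, k, oversample=10, power_iters=1, seed=0):
    """Hypothetical helper: rank-k SVD of [U diag(s) Vt, D] via random projection.

    U (m x r), s (r,), Vt (r x n) hold the current truncated SVD;
    D (m x d) holds the newly arrived columns.
    """
    rng = np.random.default_rng(seed)
    m, d = D.shape
    r = s.size
    # C = [U diag(s), D]; the full updated matrix equals C @ blockdiag(Vt, I).
    C = np.hstack([U * s, D])
    ell = min(k + oversample, C.shape[1], m)

    # Pass 1 over the data: project onto a random subspace.
    Q, _ = np.linalg.qr(C @ rng.standard_normal((C.shape[1], ell)))
    # Each power iteration costs two more passes over the data.
    for _ in range(power_iters):
        Z, _ = np.linalg.qr(C.T @ Q)
        Q, _ = np.linalg.qr(C @ Z)
    # Final pass: compress C onto the computed basis and factor the small matrix.
    B = Q.T @ C
    Ub, sb, Wt = np.linalg.svd(B, full_matrices=False)

    U_new = Q @ Ub[:, :k]
    s_new = sb[:k]
    Wt = Wt[:k]
    # Map the right factor back through blockdiag(Vt, I).
    Vt_new = np.hstack([Wt[:, :r] @ Vt, Wt[:, r:]])
    return U_new, s_new, Vt_new
```

In this sketch the new data is touched 2 + 2·`power_iters` times, which illustrates why repeated access can dominate the cost when the data does not fit in fast memory. The abstract reports that the sampling algorithms reduce this to two or three accesses.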

Original language: English
Title of host publication: Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017
Editors: Jian-Yun Nie, Zoran Obradovic, Toyotaro Suzumura, Rumi Ghosh, Raghunath Nambiar, Chonggang Wang, Hui Zang, Ricardo Baeza-Yates, Xiaohua Hu, Jeremy Kepner, Alfredo Cuzzocrea, Jian Tang, Masashi Toyoda
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 817-826
Number of pages: 10
ISBN (Electronic): 9781538627143
State: Published - Jul 1 2017
Externally published: Yes
Event: 5th IEEE International Conference on Big Data, Big Data 2017 - Boston, United States
Duration: Dec 11 2017 to Dec 14 2017

Publication series

Name: Proceedings - 2017 IEEE International Conference on Big Data, Big Data 2017
Volume: 2018-January

Conference

Conference: 5th IEEE International Conference on Big Data, Big Data 2017
Country/Territory: United States
City: Boston
Period: 12/11/17 to 12/14/17

Funding

This research was supported in part by the National Science Foundation (NSF) OAC Award number 1708299.

Funders (funder number):
National Science Foundation (1708299)
Norsk Sykepleierforbund

    Keywords

    • out-of-core
    • randomize
    • sample
    • update SVD
