Abstract
Many applications, from data compression to numerical weather prediction and information retrieval, need to compute large dense singular value decompositions (SVDs). When the problems are too large to fit into the computer's main memory, specialized out-of-core algorithms that use disk storage are required. A typical example is analyzing a large data set with tools like MATLAB or Octave when the data is too large to be loaded. To overcome this, we designed a class of out-of-memory (OOM) algorithms that reduce communication as well as overlap it with computation. Of particular interest are OOM algorithms for matrices of size m × n, where m >> n or m << n, e.g., corresponding to cases of too many variables or too many observations. To design OOM SVDs, we first study the communication cost of the SVD techniques as well as of the QR/LQ factorization followed by SVD. We present a theoretical analysis of the data movement cost and strategies for designing OOM SVD algorithms. We show performance results for multicore architectures that illustrate our theoretical findings and match our performance models. Moreover, our experimental results show the feasibility and superiority of the OOM SVD.
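As an illustration of the "QR factorization followed by SVD" idea for the m >> n case, the following is a minimal sketch and not the authors' implementation. It assumes the matrix is stored row-major in a raw binary file, reads it one row panel at a time through `numpy.memmap`, and keeps only a small running R factor in memory. The function name `oom_tall_skinny_svd`, the parameter `panel_rows`, and the file layout are hypothetical, and the sketch does not overlap disk reads with computation.

```python
import os
import tempfile

import numpy as np


def oom_tall_skinny_svd(path, m, n, panel_rows, dtype=np.float64):
    """Singular values and right singular vectors of an on-disk m x n matrix, m >> n.

    Hypothetical helper: the file at `path` is assumed to hold the matrix
    in row-major order with the given dtype.
    """
    # Map the file without loading it; only the indexed slices are read.
    A = np.memmap(path, dtype=dtype, mode="r", shape=(m, n))
    R = np.empty((0, n), dtype=dtype)
    for start in range(0, m, panel_rows):
        panel = A[start:start + panel_rows]
        # Stack the running R factor on the new panel and re-factorize.
        # The result stays at most n x n and satisfies R^T R = A^T A
        # for the rows processed so far.
        R = np.linalg.qr(np.vstack([R, panel]), mode="r")
    # R is small, so an in-core SVD is cheap; its singular values and
    # right singular vectors are those of the full matrix.
    _, s, Vt = np.linalg.svd(R, full_matrices=False)
    return s, Vt


# Small demonstration on synthetic data written to a temporary file.
rng = np.random.default_rng(0)
m, n = 2000, 8
data = rng.standard_normal((m, n))
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "A.bin")
    data.tofile(path)
    s, Vt = oom_tall_skinny_svd(path, m, n, panel_rows=256)

# Reference values from an ordinary in-memory SVD.
s_ref = np.linalg.svd(data, compute_uv=False)
```

If the left singular vectors are needed, a second pass over the file can form each block of U as the corresponding row panel times V, scaled by the reciprocal singular values. The m << n case can be handled analogously with an LQ factorization, or by applying the same routine to the transpose.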
Original language | English |
---|---|
Title of host publication | 2017 IEEE High Performance Extreme Computing Conference, HPEC 2017 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
ISBN (Electronic) | 9781538634721 |
State | Published - Oct 30 2017 |
Event | 2017 IEEE High Performance Extreme Computing Conference, HPEC 2017 - Waltham, United States; Duration: Sep 12 2017 → Sep 14 2017 |
Publication series
Name | 2017 IEEE High Performance Extreme Computing Conference, HPEC 2017 |
---|
Conference
Conference | 2017 IEEE High Performance Extreme Computing Conference, HPEC 2017 |
---|---|
Country/Territory | United States |
City | Waltham |
Period | 09/12/17 → 09/14/17 |
Funding
This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. The work was also partially supported by Nvidia and NSF under grant No. 1514406.