Abstract
In the dense nonsymmetric eigenvalue problem, work has focused on the Hessenberg reduction and QR iteration, using efficient algorithms and fast Level 3 BLAS. By comparison, the computation of eigenvectors performs poorly: it is limited to slow Level 2 BLAS performance and shows little speedup on multi-core systems. It has thus become a dominant cost in the solution of the eigenvalue problem. To address this, we present improvements that let the eigenvector computation use Level 3 BLAS and that parallelize the triangular solves, achieving good parallel scaling and accelerating the overall eigenvalue problem more than three-fold.
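To make the two steps named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes an upper triangular (complex Schur) factor `T` with distinct diagonal entries and a unitary factor `Q` such that `A = Q T Q^H`. The function name `triangular_eigenvectors` and the test matrices are hypothetical and chosen only for illustration. Each eigenvector of `T` comes from one triangular solve by back substitution, and a single matrix-matrix product `Q @ X` then transforms all of them back to eigenvectors of `A`. That product is the kind of operation that maps onto Level 3 BLAS.

```python
import numpy as np

def triangular_eigenvectors(T):
    """Eigenvectors of an upper triangular matrix T, one per column.

    Illustrative assumption: the diagonal entries of T are distinct, so no
    division by zero occurs. Production codes guard against this case.
    """
    n = T.shape[0]
    X = np.zeros((n, n), dtype=complex)
    for k in range(n):
        lam = T[k, k]          # k-th eigenvalue sits on the diagonal
        X[k, k] = 1.0          # normalize the k-th component to 1
        # Back substitution on (T[:k, :k] - lam * I) x = -T[:k, k].
        for i in range(k - 1, -1, -1):
            s = T[i, k] + T[i, i + 1:k] @ X[i + 1:k, k]
            X[i, k] = -s / (T[i, i] - lam)
    return X

# Hypothetical test problem: build A = Q T Q^H from known factors.
rng = np.random.default_rng(0)
n = 6
T = np.triu(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
T[np.diag_indices(n)] = np.arange(1, n + 1)   # distinct eigenvalues 1..n
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
A = Q @ T @ Q.conj().T

X = triangular_eigenvectors(T)   # n independent triangular solves
V = Q @ X                        # back-transform as one matrix-matrix product

# Residual of A V = V diag(lambda); should be near machine precision.
residual = np.linalg.norm(A @ V - V * np.diag(T))
```

In this sketch the triangular solves for different `k` are independent of one another, which is what makes that step a natural candidate for parallel execution.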
Original language | English |
---|---|
Title of host publication | High Performance Computing for Computational Science - VECPAR 2014 - 11th International Conference, Revised Selected Papers |
Editors | Osni Marques, Michel Dayde, Kengo Nakajima |
Publisher | Springer Verlag |
Pages | 182-191 |
Number of pages | 10 |
ISBN (Print) | 9783319173528 |
State | Published - 2015 |
Event | 11th International Conference on High Performance Computing for Computational Science, VECPAR 2014 - Eugene, United States; Duration: Jun 30 2014 → Jul 3 2014 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 8969 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 11th International Conference on High Performance Computing for Computational Science, VECPAR 2014 |
---|---|
Country/Territory | United States |
City | Eugene |
Period | 06/30/14 → 07/03/14 |
Funding
The results were obtained in part with the financial support of the Russian Scientific Fund, Agreement N14-11-00190, and of the National Science Foundation, the U.S. Department of Energy, Intel, NVIDIA, and AMD.