Abstract
In the dense nonsymmetric eigenvalue problem, work has focused on the Hessenberg reduction and QR iteration, using efficient algorithms and fast Level 3 BLAS. By comparison, the computation of eigenvectors performs poorly: it is limited to slow Level 2 BLAS performance and gains little speedup on multi-core systems. It has thus become a dominant cost in the solution of the eigenvalue problem. To address this, we present improvements that let the eigenvector computation use Level 3 BLAS and parallelize the triangular solves, achieving good parallel scaling and accelerating the overall eigenvalue problem more than three-fold.
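As a rough illustration of the two steps the abstract refers to, the sketch below computes eigenvectors of an upper triangular factor by back substitution (one triangular solve per eigenvalue) and then back-transforms all of them with a single matrix-matrix product, the kind of operation Level 3 BLAS provides. It is a minimal pure-Python sketch, not the paper's implementation: it assumes a real, strictly upper triangular Schur factor with distinct diagonal entries, and the names `triangular_eigenvectors`, `matmul`, and `transpose` are hypothetical helpers introduced only for this example.

```python
def triangular_eigenvectors(T):
    """Return X whose column k is an eigenvector of upper triangular T
    for the eigenvalue T[k][k]. Assumes distinct diagonal entries."""
    n = len(T)
    X = [[0.0] * n for _ in range(n)]
    for k in range(n):
        lam = T[k][k]
        X[k][k] = 1.0  # normalization; entries below row k stay zero
        # Back substitution on (T - lam*I) x = 0, rows k-1 down to 0.
        for i in range(k - 1, -1, -1):
            s = sum(T[i][j] * X[j][k] for j in range(i + 1, k + 1))
            X[i][k] = -s / (T[i][i] - lam)
    return X


def matmul(A, B):
    """Plain matrix-matrix product (stands in for a GEMM call)."""
    return [[sum(A[i][p] * B[p][j] for p in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


# Hypothetical test data: A = Q T Q^T with Q orthogonal, T upper triangular.
T = [[2.0, 1.0, 3.0],
     [0.0, -1.0, 4.0],
     [0.0, 0.0, 5.0]]
Q = [[0.6, -0.8, 0.0],
     [0.8, 0.6, 0.0],
     [0.0, 0.0, 1.0]]
A = matmul(matmul(Q, T), transpose(Q))

X = triangular_eigenvectors(T)  # eigenvectors of T (triangular solves)
V = matmul(Q, X)                # eigenvectors of A (one matrix product)

# Largest entry of A V - V diag(T); should be at rounding-error level.
AV = matmul(A, V)
residual = max(abs(AV[i][k] - T[k][k] * V[i][k])
               for i in range(3) for k in range(3))
```

Solving one eigenvector at a time, as in `triangular_eigenvectors`, is the vector-oriented pattern the abstract describes as slow; grouping the back-transformation of all columns into one product is the simplest place where a matrix-matrix kernel applies.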
| Original language | English |
|---|---|
| Title of host publication | High Performance Computing for Computational Science - VECPAR 2014 - 11th International Conference, Revised Selected Papers |
| Editors | Osni Marques, Michel Dayde, Kengo Nakajima |
| Publisher | Springer Verlag |
| Pages | 182-191 |
| Number of pages | 10 |
| ISBN (Print) | 9783319173528 |
| State | Published - 2015 |
| Event | 11th International Conference on High Performance Computing for Computational Science, VECPAR 2014 - Eugene, United States; Duration: Jun 30 2014 → Jul 3 2014 |
Publication series
| Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
|---|---|
| Volume | 8969 |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | 11th International Conference on High Performance Computing for Computational Science, VECPAR 2014 |
|---|---|
| Country/Territory | United States |
| City | Eugene |
| Period | 06/30/14 → 07/03/14 |
Funding
The results were obtained in part with the financial support of the Russian Scientific Fund (Agreement N14-11-00190), the National Science Foundation, the U.S. Department of Energy, Intel, NVIDIA, and AMD.