Abstract
The solution of nonsymmetric eigenvalue problems, Ax = λx, can be accelerated substantially by first reducing A to an upper Hessenberg matrix H that has the same eigenvalues as A. This can be done using Householder orthogonal transformations, which is a well-established standard, or stabilized elementary transformations. The latter approach, although it requires half the flops of the former, has been used less in practice, e.g., on computer architectures with well-developed hierarchical memories, because of its memory-bound operations and the complexity of stabilizing it. In this paper we revisit the stabilized elementary transformations approach in the context of new architectures, both multicore CPUs and Xeon Phi coprocessors. We derive for the first time a blocked version of the algorithm. The blocked version reduces the memory-bound operations, and we analyze its performance. A performance model is developed that shows the limitations of both approaches. The competitiveness of using stabilized elementary transformations has been quantified, highlighting that it can be 20 to 30% faster on current high-end multicore CPUs and Xeon Phi coprocessors.
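To make the idea of a reduction by stabilized elementary transformations concrete, the following is a minimal, unblocked sketch in Python with NumPy. It is not the blocked algorithm derived in the paper and says nothing about its performance; it only illustrates the textbook scheme (Gaussian-elimination-style similarity transformations with row and column interchanges for stability). The function name `hessenberg_elementary` and the test matrix are assumptions made for illustration.

```python
import numpy as np


def hessenberg_elementary(A):
    """Reduce a square matrix to upper Hessenberg form (illustrative sketch).

    Uses elementary (Gauss) similarity transformations with pivoting,
    so the result has the same eigenvalues as A up to rounding.
    Hypothetical helper, not the paper's blocked implementation.
    """
    H = np.array(A, dtype=float)  # work on a copy
    n = H.shape[0]
    for k in range(n - 2):
        # Pivot: largest entry in magnitude below the diagonal of column k.
        p = k + 1 + int(np.argmax(np.abs(H[k + 1:, k])))
        if H[p, k] == 0.0:
            continue  # column is already reduced
        if p != k + 1:
            # Swap rows and the matching columns to keep a similarity.
            H[[k + 1, p], :] = H[[p, k + 1], :]
            H[:, [k + 1, p]] = H[:, [p, k + 1]]
        # Multipliers; each has magnitude <= 1 because of the pivoting.
        l = H[k + 2:, k] / H[k + 1, k]
        # Row operation: eliminate the entries below the subdiagonal.
        H[k + 2:, :] -= np.outer(l, H[k + 1, :])
        H[k + 2:, k] = 0.0  # clear rounding residue
        # Inverse column operation completes the similarity transformation.
        H[:, k + 1] += H[:, k + 2:] @ l
    return H


# Usage on a small random matrix (assumed test data).
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
H = hessenberg_elementary(A)
print(np.round(H, 3))
```

The row update and the column update in each step are matrix-vector style operations, which is the memory-bound behavior the abstract refers to; a blocked variant would aim to group such updates into matrix-matrix operations.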
Original language | English |
---|---|
Pages (from-to) | 135-142 |
Number of pages | 8 |
Journal | Simulation Series |
Volume | 47 |
Issue number | 4 |
State | Published - 2015 |
Event | 23rd High Performance Computing Symposium, HPC 2015, Part of the 2015 Spring Simulation Multi-Conference, SpringSim 2015 - Alexandria, United States; Duration: Apr 12 2015 → Apr 15 2015 |
Funding
Funders | Funder number |
---|---|
National Science Foundation | ACI-1339822 |
Keywords
- Eigenvalue problem
- Hessenberg reduction
- Multi/many-core
- Stabilized elementary transformations