Abstract
We present an efficient and scalable programming model for developing linear algebra software in heterogeneous multi-coprocessor environments. The model incorporates some of the current best design and implementation practices for the heterogeneous acceleration of dense linear algebra (DLA). As examples, we use the factorizations that underlie algorithms for solving linear systems: LU, QR, and Cholesky. To generate the extreme level of parallelism needed for the efficient use of coprocessors, the algorithms of interest are redesigned and then split into well-chosen computational tasks. Task execution is scheduled over the computational components of a hybrid system of multi-core CPUs and coprocessors using a lightweight runtime system. The lightweight runtime system keeps scheduling overhead low while enabling the expression of parallelism through otherwise sequential code. This simplifies development and allows the exploration of the unique strengths of the various hardware components.
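The idea of expressing parallelism through otherwise sequential code can be illustrated with a minimal sketch. It is not the runtime described in the paper: `TaskRuntime`, `insert`, `run`, and `tiled_cholesky` are hypothetical names, each "tile" is reduced to a single scalar so the example stays short, and placing tasks on CPUs versus coprocessors is not modeled. In a real tiled Cholesky the three kernels would be block operations (the LAPACK/BLAS routines POTRF, TRSM, and SYRK/GEMM).

```python
import math
from concurrent.futures import ThreadPoolExecutor


class TaskRuntime:
    """Toy runtime (hypothetical): tasks are inserted in sequential program
    order, and dependencies are inferred from the data each task reads/writes."""

    def __init__(self):
        self.tasks = []  # (function, reads, writes) in insertion order

    def insert(self, func, reads, writes):
        self.tasks.append((func, frozenset(reads), frozenset(writes)))

    def _dependencies(self):
        deps = []
        for i, (_, r_i, w_i) in enumerate(self.tasks):
            d = set()
            for j in range(i):
                _, r_j, w_j = self.tasks[j]
                # Read-after-write, write-after-write, or write-after-read hazard.
                if (w_j & (r_i | w_i)) or (r_j & w_i):
                    d.add(j)
            deps.append(d)
        return deps

    def run(self, workers=4):
        deps = self._dependencies()
        done, pending = set(), set(range(len(self.tasks)))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            while pending:
                # Tasks whose dependencies are all finished are mutually
                # independent, so the whole wave can run concurrently.
                ready = sorted(i for i in pending if deps[i] <= done)
                list(pool.map(lambda i: self.tasks[i][0](), ready))
                done.update(ready)
                pending.difference_update(ready)


def potrf(A, k):            # factor the diagonal "tile"
    A[k][k] = math.sqrt(A[k][k])


def trsm(A, i, k):          # triangular solve for a panel "tile"
    A[i][k] /= A[k][k]


def update(A, i, j, k):     # trailing update (SYRK when i == j, else GEMM)
    A[i][j] -= A[i][k] * A[j][k]


def tiled_cholesky(A):
    """Right-looking Cholesky written as a plain sequential loop nest; the
    runtime, not the loop order, decides what may execute in parallel.
    The lower triangle of A is overwritten with L."""
    n = len(A)
    rt = TaskRuntime()
    for k in range(n):
        rt.insert(lambda k=k: potrf(A, k), reads=[(k, k)], writes=[(k, k)])
        for i in range(k + 1, n):
            rt.insert(lambda i=i, k=k: trsm(A, i, k),
                      reads=[(k, k), (i, k)], writes=[(i, k)])
        for i in range(k + 1, n):
            for j in range(k + 1, i + 1):
                rt.insert(lambda i=i, j=j, k=k: update(A, i, j, k),
                          reads=[(i, k), (j, k), (i, j)], writes=[(i, j)])
    rt.run()
    return A


A = [[4.0, 12.0, -16.0],
     [12.0, 37.0, -43.0],
     [-16.0, -43.0, 98.0]]
tiled_cholesky(A)
L = [[A[i][j] if j <= i else 0.0 for j in range(3)] for i in range(3)]
```

The loop nest in `tiled_cholesky` reads like the sequential algorithm. The only extra information supplied is which tiles each task reads and writes, which is what lets a runtime of this kind discover that, for example, all panel solves of one step are independent of each other.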
Original language | English |
---|---|
Title of host publication | High Performance Computing for Computational Science - VECPAR 2014 - 11th International Conference, Revised Selected Papers |
Editors | Osni Marques, Michel Dayde, Kengo Nakajima |
Publisher | Springer Verlag |
Pages | 31-42 |
Number of pages | 12 |
ISBN (Print) | 9783319173528 |
State | Published - 2015 |
Event | 11th International Conference on High Performance Computing for Computational Science, VECPAR 2014 - Eugene, United States. Duration: Jun 30 2014 → Jul 3 2014 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 8969 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 11th International Conference on High Performance Computing for Computational Science, VECPAR 2014 |
---|---|
Country/Territory | United States |
City | Eugene |
Period | 06/30/14 → 07/03/14 |
Bibliographical note
Publisher Copyright: © Springer International Publishing Switzerland 2015.
Funding
This research was supported in part by the National Science Foundation under Grants OCI-1032815, ACI-1339822, and Subcontract RA241-G1 on NSF Prime Grant OCI-0910735, DOE under Grants DE-SC0004983 and DE-SC0010042, and Intel Corporation.
Funders | Funder number |
---|---|
National Science Foundation | OCI-1032815, OCI-0910735, ACI-1339822, RA241-G1 |
U.S. Department of Energy | DE-SC0004983, DE-SC0010042 |
Intel Corporation | |