Abstract
One trend in modern high performance computing is to decompose a large linear algebra problem into thousands of small problems that can be solved independently. To support this, we are developing a new BLAS standard, Batched BLAS, that lets users perform thousands of small BLAS operations in parallel while making efficient use of their hardware. We describe how we are implementing this new standard and outline the process we plan to follow during its development.
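To make the batched idea concrete, the following is a minimal sketch in plain Python, not the interface proposed by the standard. The names `small_gemm` and `batched_gemm`, their argument order, and the `max_workers` parameter are assumptions chosen for illustration. Each problem in the batch computes `alpha * A @ B + beta * C` on small matrices stored as lists of rows, and the problems are dispatched independently to a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import repeat


def small_gemm(alpha, a, b, beta, c):
    """Return alpha * a @ b + beta * c for small dense matrices (lists of rows)."""
    m, k, n = len(a), len(b), len(b[0])
    return [
        [alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
         for j in range(n)]
        for i in range(m)
    ]


def batched_gemm(alpha, a_batch, b_batch, beta, c_batch, max_workers=4):
    """Hypothetical batched call: one independent small GEMM per batch entry."""
    if not (len(a_batch) == len(b_batch) == len(c_batch)):
        raise ValueError("all batches must have the same length")
    # The problems share no data, so they can be scheduled in any order.
    # pool.map returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(small_gemm, repeat(alpha), a_batch, b_batch,
                             repeat(beta), c_batch))
```

The thread pool here only illustrates that the small problems are independent. In CPython it will not speed up pure Python arithmetic, and an optimized implementation would instead map the batch onto the parallel hardware directly.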
Original language | English |
---|---|
Journal | CEUR Workshop Proceedings |
Volume | 1686 |
State | Published - 2016 |
Event | 4th Workshop on Sustainable Software for Science: Practice and Experiences, WSSSPE4 2016 - Manchester, United Kingdom. Duration: Sep 12 2016 → Sep 14 2016 |
Keywords
- BLAS
- High performance computing
- Scientific computing