Abstract
In this paper, we present work towards the development of a new data analytics and machine learning (ML) framework, called MagmaDNN. Our main goal is to provide scalable, high-performance data analytics and ML solutions for scientific applications running on current and upcoming heterogeneous, many-core, GPU-accelerated architectures. To this end, since many of the required functionalities are based on standard linear algebra (LA) routines, we designed MagmaDNN to derive its performance from the MAGMA library. This close integration provides the fundamental, scalable, high-performance LA routines available in MAGMA as a backend to MagmaDNN. We present design issues for performance and scalability that are specific to ML using deep neural networks (DNNs), as well as the MagmaDNN designs for overcoming them. In particular, MagmaDNN uses well-established HPC techniques from the area of dense LA, including task-based parallelization, DAG representations, scheduling, mixed-precision algorithms, asynchronous solvers, and autotuned hyperparameter optimization. We illustrate these techniques, their incorporation into MagmaDNN, and their use to outperform other currently available frameworks.
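To give a concrete sense of how a DNN framework leans on a dense LA backend, the sketch below shows a single fully connected forward pass, Y = W·X, offloaded to the GPU through MAGMA's sgemm. This is illustrative only and does not reproduce the MagmaDNN API; it uses standard MAGMA 2.x calls, and the matrix sizes and compile line are assumptions.

```cpp
// Illustrative sketch (not the MagmaDNN API): a fully connected forward pass
// Y = W * X computed on the GPU with MAGMA's sgemm -- the kind of dense-LA
// backend call the framework builds on. Assumes MAGMA 2.x is installed;
// compile roughly as: g++ fc_sgemm.cpp -lmagma -lcublas -lcudart
#include <cstdio>
#include <vector>
#include <magma_v2.h>

int main() {
    magma_init();
    magma_queue_t queue;
    magma_queue_create(0, &queue);                  // execution queue on device 0

    const magma_int_t m = 128, n = 256, k = 512;    // output dim, batch size, input dim
    std::vector<float> W(m * k, 0.01f), X(k * n, 1.0f), Y(m * n, 0.0f);

    magmaFloat_ptr dW, dX, dY;
    magma_smalloc(&dW, m * k);
    magma_smalloc(&dX, k * n);
    magma_smalloc(&dY, m * n);

    // Host -> device copies (column-major, as in LAPACK/MAGMA).
    magma_ssetmatrix(m, k, W.data(), m, dW, m, queue);
    magma_ssetmatrix(k, n, X.data(), k, dX, k, queue);

    // Y = 1.0 * W * X + 0.0 * Y on the GPU.
    magma_sgemm(MagmaNoTrans, MagmaNoTrans, m, n, k,
                1.0f, dW, m, dX, k, 0.0f, dY, m, queue);

    magma_sgetmatrix(m, n, dY, m, Y.data(), m, queue);
    printf("Y(0,0) = %f\n", Y[0]);                  // expect k * 0.01 = 5.12

    magma_free(dW); magma_free(dX); magma_free(dY);
    magma_queue_destroy(queue);
    magma_finalize();
    return 0;
}
```

In a full framework, calls like this are issued per layer and per mini-batch, which is why the scheduling, task-based parallelization, and mixed-precision choices discussed in the paper dominate end-to-end performance.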
Original language | English |
---|---|
Title of host publication | High Performance Computing - ISC High Performance 2019 International Workshops, Revised Selected Papers |
Editors | Michèle Weiland, Guido Juckeland, Sadaf Alam, Heike Jagode |
Publisher | Springer |
Pages | 490-503 |
Number of pages | 14 |
ISBN (Print) | 9783030343552 |
DOIs | |
State | Published - 2019 |
Event | 34th International Conference on High Performance Computing, ISC High Performance 2019 - Frankfurt, Germany. Duration: Jun 16 2019 → Jun 20 2019 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 11887 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 34th International Conference on High Performance Computing, ISC High Performance 2019 |
---|---|
Country/Territory | Germany |
City | Frankfurt |
Period | 06/16/19 → 06/20/19 |
Funding
This work was conducted at the Joint Institute for Computational Sciences (JICS) and the Innovative Computing Laboratory (ICL). This work is sponsored by the National Science Foundation (NSF) through NSF REU Award #1659502, with additional support from the University of Tennessee, Knoxville (UTK), the National Institute for Computational Sciences (NICS), and NSF Awards #1740250 and #1709069. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by NSF grant #ACI-1548562. Computational resources were made available through XSEDE education allocation awards TG-ASC170031 and TG-ASC190013. In addition, the computing work was also performed on technical workstations donated by the BP High Performance Computing Team, as well as on GPUs donated by NVIDIA.
Keywords
- Data-driven scientific computing
- High-performance DNN
- Machine learning