On the design, development, and analysis of optimized matrix-vector multiplication routines for coprocessors

Khairul Kabir, Azzam Haidar, Stanimire Tomov, Jack Dongarra

Research output: Contribution to journal › Conference article › peer-review

4 Scopus citations

Abstract

The manycore paradigm shift, and the resulting change in modern computer architectures, has made the development of optimal numerical routines extremely challenging. In this work, we target the development of numerical algorithms and implementations for Xeon Phi coprocessor architecture designs. In particular, we examine and optimize the general and symmetric matrix-vector multiplication routines (gemv/symv), which are some of the most heavily used linear algebra kernels in many important engineering and physics applications. We describe a successful approach to addressing the challenges of this problem, starting with our algorithm design, performance analysis, and programming model, and moving to kernel optimization. Our goal, by targeting low-level and easy-to-understand fundamental kernels, is to develop new optimization strategies that can be effective elsewhere on manycore coprocessors, and to show significant performance improvements compared to existing state-of-the-art implementations. Therefore, in addition to the new optimization strategies, analysis, and optimal performance results, we finally present the significance of using these routines/strategies to accelerate higher-level numerical algorithms for the eigenvalue problem (EVP) and the singular value decomposition (SVD), which by themselves are foundational for many important applications.
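
For readers unfamiliar with the two kernels named in the abstract, the following is a minimal reference sketch of what gemv and symv compute, using the standard BLAS definition y := alpha*A*x + beta*y. It is not the optimized Xeon Phi implementation described in the paper; the function names `gemv` and `symv_lower`, and the list-of-rows matrix layout, are assumptions made purely for illustration.

```python
def gemv(alpha, A, x, beta, y):
    """Reference y := alpha*A*x + beta*y for a dense m-by-n matrix A (list of rows)."""
    return [
        alpha * sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + beta * y_i
        for row, y_i in zip(A, y)
    ]


def symv_lower(alpha, A, x, beta, y):
    """Reference y := alpha*A*x + beta*y for a symmetric n-by-n matrix A.

    Only the lower triangle of A (entries A[i][j] with j <= i) is read;
    each off-diagonal entry is used twice, once for row i and once for row j.
    """
    n = len(x)
    out = [beta * y_i for y_i in y]
    for i in range(n):
        for j in range(i + 1):
            a = A[i][j]
            out[i] += alpha * a * x[j]
            if j != i:
                # Symmetry: A[j][i] equals A[i][j], so reuse the loaded value.
                out[j] += alpha * a * x[i]
    return out
```

The symmetric variant reads roughly half as many matrix entries as the general one. This matters because matrix-vector products are generally limited by memory bandwidth rather than by arithmetic.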

Original language: English
Article number: A5
Pages (from-to): 58-73
Number of pages: 16
Journal: Lecture Notes in Computer Science
Volume: 9137 LNCS
State: Published - 2015
Event: 30th International Conference on High Performance Computing, ISC 2015 - Frankfurt, Germany
Duration: Jul 12 2015 to Jul 16 2015

Funding

This material is based upon work supported by the National Science Foundation under Grant No. ACI-1339822, the Department of Energy, and Intel. The results were obtained in part with the financial support of the Russian Scientific Fund, Agreement N14-11-00190.

Funders and funder numbers:
Department of Energy, and Intel (no funder number listed)
Russian Scientific Fund: N14-11-00190
National Science Foundation: ACI-1339822
