Towards an Auto-Tuned and Task-Based SpMV (LASs Library)

Sandra Catalán, Tetsuzo Usui, Leonel Toledo, Xavier Martorell, Jesús Labarta, Pedro Valero-Lara

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

9 Scopus citations

Abstract

We present a novel approach to parallelizing the SpMV kernel included in the LASs (Linear Algebra routines on OmpSs) library, following a thorough review and analysis of several well-known approaches. LASs is based on OmpSs, a task-based runtime that extends OpenMP directives and provides more flexibility for applying new strategies. Building on tasking and nesting, and with the aim of reducing the workload imbalance inherent to the SpMV operation, we present a strategy that is especially useful for highly imbalanced input matrices. In this approach, the number of tasks created is decided dynamically in order to maximize the use of the platform's resources. Throughout the paper, the behavior of SpMV under each strategy (state-of-the-art and proposed) is analyzed in depth, laying the groundwork for a future auto-tunable code that can select the most suitable approach for a given input matrix. The experiments use a set of 12 matrices from the SuiteSparse Matrix Collection, each with different sparsity characteristics. They were performed on a node of the MareNostrum 4 supercomputer (two Intel Xeon sockets, 24 cores each) and on a node of the Dibona cluster (one ARM ThunderX2 socket with 32 cores). Our tests show that, on Intel Xeon, the best parallelization strategy reduces the execution time of the reference multi-threaded MKL version by up to 67%. On ARM ThunderX2, the reduction is up to 56% with respect to the OmpSs parallel reference.
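
The abstract does not spell out how tasks are sized, so the following is only an illustrative sketch of one common way to even out work in SpMV on a CSR matrix: rows are split into contiguous chunks holding roughly the same number of nonzeros, and each chunk could then be handed to one task. It is not the LASs implementation. The function names (`spmv_rows`, `balanced_chunks`, `spmv`) and the `num_tasks` parameter are hypothetical, and the chunks are run sequentially here rather than through a tasking runtime.

```python
from bisect import bisect_left


def spmv_rows(row_ptr, col_idx, vals, x, y, start, stop):
    """Compute y[i] = sum_j A[i, j] * x[j] for rows start..stop-1 of a CSR matrix."""
    for i in range(start, stop):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]
        y[i] = acc


def balanced_chunks(row_ptr, num_tasks):
    """Split rows into at most num_tasks contiguous chunks with similar nonzero counts.

    Hypothetical helper: row_ptr is the CSR row-pointer array, so row_ptr[i]
    is the number of nonzeros stored before row i.
    """
    n_rows = len(row_ptr) - 1
    nnz = row_ptr[-1]
    bounds = [0]
    for t in range(1, num_tasks):
        target = nnz * t / num_tasks
        # First row boundary whose nonzero prefix count reaches the target.
        b = bisect_left(row_ptr, target)
        b = min(max(b, bounds[-1]), n_rows)
        if b > bounds[-1]:  # skip empty chunks
            bounds.append(b)
    if bounds[-1] != n_rows:
        bounds.append(n_rows)
    return list(zip(bounds[:-1], bounds[1:]))


def spmv(row_ptr, col_idx, vals, x, num_tasks=4):
    """Sequential stand-in for a task-based SpMV: one loop iteration per chunk."""
    y = [0.0] * (len(row_ptr) - 1)
    for start, stop in balanced_chunks(row_ptr, num_tasks):
        # Chunks write disjoint parts of y, so each could run as an independent task.
        spmv_rows(row_ptr, col_idx, vals, x, y, start, stop)
    return y
```

For a matrix whose first row holds 100 of its 103 nonzeros, `balanced_chunks([0, 100, 101, 102, 103], 2)` puts the heavy row in its own chunk and groups the three light rows together, whereas an even split by row count would pair the heavy row with a light one.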

Original language: English
Title of host publication: OpenMP
Subtitle of host publication: Portable Multi-Level Parallelism on Modern Systems - 16th International Workshop on OpenMP, IWOMP 2020, Proceedings
Editors: Kent Milfeld, Lars Koesterke, Bronis R. de Supinski, Jannis Klinkenberg
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 115-129
Number of pages: 15
ISBN (Print): 9783030581435
State: Published - 2020
Externally published: Yes
Event: 16th International Workshop on OpenMP, IWOMP 2020 - Austin, United States
Duration: Sep 22 2020 to Sep 24 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12295 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th International Workshop on OpenMP, IWOMP 2020
Country/Territory: United States
City: Austin
Period: 09/22/20 to 09/24/20

Funding

This project has received funding from the Spanish Ministry of Economy and Competitiveness under the project Computación de Altas Prestaciones VII (TIN2015-65316-P), the Departament d’Innovació, Universitats i Empresa de la Generalitat de Catalunya, under project MPEXPAR: Models de Programació i Entorns d’Execució Parallels (2014-SGR-1051), and the Juan de la Cierva Grant Agreement No IJCI-2017-33511, and the Spanish Ministry of Science and Innovation under the project Heterogeneidad y especialización en la era post-Moore (RTI2018-093684-B-I00). We also acknowledge the funding provided by Fujitsu under the BSC-Fujitsu joint project: Math Libraries Migration and Optimization.

Funders and funder numbers
Computación de Altas Prestaciones VII: TIN2015-65316-P
Spanish Ministry of Economy and Competitiveness
Generalitat de Catalunya: IJCI-2017-33511, 2014-SGR-1051
Fujitsu
Ministerio de Ciencia e Innovación: RTI2018-093684-B-I00

    Keywords

    • Auto-tuning
    • LASs
    • Nesting
    • OmpSs
    • Parallel programming
    • SpMV
    • Tasking
    • Taskloop
