Porting DMRG++ Scientific Application to OpenPOWER

Arghya Chatterjee, Gonzalo Alvarez, Eduardo D’Azevedo, Wael Elwasif, Oscar Hernandez, Vivek Sarkar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

With rapidly changing microprocessor designs and growing architectural diversity (multi-cores, many-cores, accelerators) in next-generation HPC systems, scientific applications must adapt to the hardware to exploit the different types of parallelism and resources available in the architecture. To benefit from all the in-node hardware threads, it is important to use a single programming model to map and coordinate the available work across the heterogeneous execution units in the node (e.g., latency-optimized multi-core hardware threads and bandwidth-optimized accelerators). Our goal is to show that the node complexity of these systems can be managed by using OpenMP for in-node parallelization, exploiting the different “programming styles” supported by OpenMP 4.5 to program CPU cores and accelerators. Finding the most suitable programming style (e.g., SPMD style, multi-level tasks, accelerator programming, nested parallelism, or a combination of these) using the latest features of OpenMP to maximize performance and achieve performance portability across heterogeneous and homogeneous systems is still an open research problem. We developed a mini-application, Kronecker Product (KP), from the computational motif (sparse matrix algebra) of the original DMRG++ application to experiment with different OpenMP programming styles on an OpenPOWER architecture, and we present the results in this paper.
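To make the Kronecker product motif concrete, the following is a minimal sketch of one of the programming styles named in the abstract (loop-level data parallelism on CPU cores). It is not taken from the paper or from DMRG++: the function name `kron_matvec`, the dense row-major storage, and the choice of `collapse(2)` are all assumptions made for illustration, whereas the original application works with sparse matrices. The routine computes y = (A ⊗ B) x without forming A ⊗ B explicitly.

```c
#include <assert.h>

/*
 * Hypothetical helper (not from DMRG++): y = (A kron B) x.
 * A is m x n, B is p x q, both dense and row-major.
 * x has n*q entries, y has m*p entries, and y must not alias x.
 */
void kron_matvec(int m, int n, int p, int q,
                 const double *A, const double *B,
                 const double *x, double *y)
{
    assert(m > 0 && n > 0 && p > 0 && q > 0);
    assert(x != y);

    /* Each (i, k) pair owns exactly one entry of y, so the two outer
     * loops can be collapsed and shared among threads without any
     * synchronization. Without OpenMP enabled the pragma is ignored
     * and the loop nest runs serially. */
    #pragma omp parallel for collapse(2)
    for (int i = 0; i < m; ++i) {
        for (int k = 0; k < p; ++k) {
            double acc = 0.0;
            for (int j = 0; j < n; ++j) {
                for (int l = 0; l < q; ++l) {
                    /* (A kron B)[i*p + k][j*q + l] = A[i][j] * B[k][l] */
                    acc += A[i * n + j] * B[k * q + l] * x[j * q + l];
                }
            }
            y[i * p + k] = acc;
        }
    }
}
```

Compiled with an OpenMP-capable compiler (for example with `-fopenmp` on GCC), the outer iteration space is divided among the available hardware threads. An accelerator or task-based variant would restructure the same loop nest with different OpenMP 4.5 constructs, which is the kind of comparison the abstract describes.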

Original language: English
Title of host publication: High Performance Computing - ISC High Performance 2018 International Workshops, Revised Selected Papers
Editors: John Shalf, Sadaf Alam, Rio Yokota, Michèle Weiland
Publisher: Springer Verlag
Pages: 418-431
Number of pages: 14
ISBN (Print): 9783030024642
State: Published - 2018
Event: International Conference on High Performance Computing, ISC High Performance 2018 - Frankfurt, Germany
Duration: Jun 28 2018 → Jun 28 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11203 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Conference on High Performance Computing, ISC High Performance 2018
Country/Territory: Germany
City: Frankfurt
Period: 06/28/18 → 06/28/18

Funding

Acknowledgment. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. Research sponsored by the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the U.S. Department of Energy. G. Alvarez: Author contribution consisted in explaining the DMRG algorithm and its implementation, and not in the OpenMP use and evaluation. This manuscript has been co-authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Funders (funder number, where given)
US Department of Energy
U.S. Department of Energy (DE-AC05-00OR22725)
Office of Science
Oak Ridge National Laboratory

    Keywords

    • Data parallelism
    • Nested parallelism
    • OpenMP
    • OpenMP 4.5
    • Power8
    • Task parallelism
