Mixed-tool performance analysis on hybrid multicore architectures

Peng Du, Piotr Luszczek, Stanimire Tomov, Jack Dongarra

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

This paper proposes a triangular solve algorithm with variable block size for graphics processing units (GPUs). By inverting the diagonal blocks recursively, the algorithm works with a tunable block size that can be chosen to achieve the best performance. Various methods are shown for using existing profiling tools to measure and analyze the performance of this algorithm. We use some of the most popular CPU and GPU profiling tools for their advantages and overcome their disadvantages with several new techniques for analyzing the performance of, and the relationships between, different components of an application. The presented methodologies produce insight that helps to understand and tune the proposed algorithm and considerably improves the performance of the solver itself as well as of the application using it.
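The abstract gives no implementation details, so the following is only a minimal CPU sketch of the general idea it names: invert each diagonal block recursively, so that every block step of the triangular solve becomes a matrix product, with the block size left as a tunable parameter. The function names `matmul`, `invert_lower`, and `blocked_lower_solve`, the parameter `nb`, and the restriction to a lower-triangular matrix with a single right-hand side are assumptions made for illustration; this is not the paper's GPU code.

```python
def matmul(A, B):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def invert_lower(L):
    """Recursively invert a nonsingular lower-triangular matrix.

    Uses inv([[A, 0], [C, D]]) = [[inv(A), 0], [-inv(D) C inv(A), inv(D)]].
    """
    n = len(L)
    if n == 1:
        return [[1.0 / L[0][0]]]
    h = n // 2
    A = [row[:h] for row in L[:h]]   # top-left block
    C = [row[:h] for row in L[h:]]   # bottom-left block
    D = [row[h:] for row in L[h:]]   # bottom-right block
    Ai = invert_lower(A)
    Di = invert_lower(D)
    X = matmul(matmul(Di, C), Ai)    # inv(D) C inv(A), negated below
    top = [r + [0.0] * (n - h) for r in Ai]
    bottom = [[-v for v in xr] + dr for xr, dr in zip(X, Di)]
    return top + bottom


def blocked_lower_solve(L, b, nb):
    """Solve L x = b for lower-triangular L with block size nb (hypothetical name)."""
    n = len(L)
    x = [0.0] * n
    for k in range(0, n, nb):
        e = min(k + nb, n)
        # Subtract the contribution of the already computed entries of x.
        rhs = [b[i] - sum(L[i][j] * x[j] for j in range(k))
               for i in range(k, e)]
        # Invert the diagonal block and apply it as a matrix-vector product.
        Dinv = invert_lower([row[k:e] for row in L[k:e]])
        for i in range(k, e):
            x[i] = sum(Dinv[i - k][j] * rhs[j] for j in range(e - k))
    return x


L_demo = [[2.0, 0.0, 0.0],
          [1.0, 3.0, 0.0],
          [4.0, 5.0, 6.0]]
b_demo = [2.0, 7.0, 32.0]            # equals L_demo applied to [1, 2, 3]
x_demo = blocked_lower_solve(L_demo, b_demo, nb=2)
```

Here `x_demo` should be approximately `[1, 2, 3]` up to rounding. Replacing the in-block substitution with multiplication by an inverted block is the kind of transformation that suits GPUs, since matrix products parallelize far better than sequential substitution; how the paper actually selects the block size is not stated in the abstract.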

Original language: English
Title of host publication: Proceedings - 2010 39th International Conference on Parallel Processing Workshops, ICPPW 2010
Pages: 236-244
Number of pages: 9
State: Published - 2010
Event: 2010 39th International Conference on Parallel Processing Workshops, ICPPW 2010 - San Diego, CA, United States
Duration: Sep 13 2010 to Sep 16 2010

Publication series

Name: Proceedings of the International Conference on Parallel Processing Workshops
ISSN (Print): 1530-2016

Conference

Conference: 2010 39th International Conference on Parallel Processing Workshops, ICPPW 2010
Country/Territory: United States
City: San Diego, CA
Period: 09/13/10 to 09/16/10
