Diagnosis and optimization of application prefetching performance

Gabriel Marin, Collin McCurdy, Jeffrey S. Vetter

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

13 Scopus citations

Abstract

Hardware prefetchers are effective at recognizing streaming memory access patterns and at moving data closer to the processing units to hide memory latency. However, hardware prefetchers can track only a limited number of data streams due to finite hardware resources. In this paper, we introduce the term streaming concurrency to characterize the number of parallel, logical data streams in an application. We present a simulation algorithm for understanding the streaming concurrency at any point in an application, and we show that this metric is a good predictor of the number of memory requests initiated by streaming prefetchers. Next, we try to understand the causes behind poor prefetching performance. We identify four prefetch-unfriendly conditions and show how to classify an application's memory references based on these conditions. We evaluated our analysis using the SPEC CPU2006 benchmark suite. We selected two benchmarks with unfavorable access patterns and transformed them to improve their prefetching effectiveness. Results show that making applications more prefetcher-friendly can yield meaningful performance gains.
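As a rough illustration of the streaming concurrency idea, the sketch below counts how many ascending, unit-stride cache-line streams are live at each point in an address trace. It is not the simulation algorithm from the paper, whose details the abstract does not give. The function name `streaming_concurrency`, the 64-byte line size, the `window` expiry rule, and the "two consecutive lines confirm a stream" threshold are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

LINE_BYTES = 64  # assumed cache-line size, not taken from the paper


def streaming_concurrency(addresses: List[int], window: int = 16) -> List[int]:
    """Return, for each access, the number of live confirmed streams.

    Toy model: a stream is a run of accesses to consecutive cache lines.
    A stream is "confirmed" once it has touched at least two lines, and it
    is dropped if it has not been touched within the last `window` accesses.
    """
    # Maps the next cache line a stream expects to
    # (lines touched so far, index of the stream's most recent access).
    expected: Dict[int, Tuple[int, int]] = {}
    counts: List[int] = []
    for i, addr in enumerate(addresses):
        line = addr // LINE_BYTES
        # Expire streams that have been idle for more than `window` accesses.
        expected = {nxt: (n, t) for nxt, (n, t) in expected.items()
                    if i - t <= window}
        if line in expected:
            # The access advances an existing stream by one line.
            n, _ = expected.pop(line)
            expected[line + 1] = (n + 1, i)
        elif line + 1 in expected:
            # Repeated access to a stream's current line: refresh its age.
            n, _ = expected[line + 1]
            expected[line + 1] = (n, i)
        else:
            # Unrelated line: start a new candidate stream.
            expected[line + 1] = (1, i)
        counts.append(sum(1 for n, _ in expected.values() if n >= 2))
    return counts


# Two interleaved arrays, each walked one cache line at a time.
a, b = 0, 1 << 20
trace = [a, b, a + 64, b + 64, a + 128, b + 128]
print(max(streaming_concurrency(trace)))
```

In this toy model, a peak count that exceeds the number of streams a hardware prefetcher can track would mark a region where some streams are likely to go unprefetched, which is the kind of situation the abstract describes.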

Original language: English
Title of host publication: ICS 2013 - Proceedings of the 2013 ACM International Conference on Supercomputing
Pages: 303-312
Number of pages: 10
State: Published - 2013
Event: 27th ACM International Conference on Supercomputing, ICS 2013 - Eugene, OR, United States
Duration: Jun 10 2013 to Jun 14 2013

Publication series

Name: Proceedings of the International Conference on Supercomputing

Conference

Conference: 27th ACM International Conference on Supercomputing, ICS 2013
Country/Territory: United States
City: Eugene, OR
Period: 06/10/13 to 06/14/13

Keywords

  • diagnosis
  • performance modeling
  • stream prefetching
