Abstract
Integrated shared memory heterogeneous architectures are pervasive because they satisfy the diverse needs of mobile, autonomous, and edge computing platforms. Although specialized processing units (PUs) that share a unified system memory improve performance and energy efficiency by reducing data movement, they also increase contention for this memory since the PUs interact with each other. Prior work has investigated performance degradation due to memory contention, but few have studied the relationship of power and energy to memory contention. Moreover, a comprehensive solution that models memory contention for kernel placement on contemporary heterogeneous systems on chip (SoCs) in response to energy and performance has been largely unaddressed. This paper presents MEPHESTO, a novel and holistic approach for managing this balance. The authors characterize applications and PUs in terms of two memory contention factors (time factors and power factors) to achieve the desired trade-off between energy and performance for collocated kernel execution on heterogeneous systems. The authors believe that this investigation is the first to combine all of these factors and present a simple knob-based approach that expresses the target trade-off. The approach is evaluated on a diverse integrated shared memory heterogeneous system with a CPU, GPU, and programmable vision accelerator. By using an empirical model for memory contention that provides up to 92% accuracy, the kernel collocation approach can provide a near-optimal ordering and placement based on the user-defined energy-performance trade-off parameter. Moreover, the dynamic programming-based heuristics provide up to 30% better energy or 20% performance benefits when compared with the greedy approaches commonly employed by previous studies.
Original language | English |
---|---|
Title of host publication | PACT 2020 - Proceedings of the ACM International Conference on Parallel Architectures and Compilation Techniques |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 413-425 |
Number of pages | 13 |
ISBN (Electronic) | 9781450380751 |
State | Published - Sep 30 2020 |
Event | 2020 ACM International Conference on Parallel Architectures and Compilation Techniques, PACT 2020 - Virtual, Online, United States. Duration: Oct 3 2020 → Oct 7 2020 |
Publication series
Name | Parallel Architectures and Compilation Techniques - Conference Proceedings, PACT |
---|---|
ISSN (Print) | 1089-795X |
Conference
Conference | 2020 ACM International Conference on Parallel Architectures and Compilation Techniques, PACT 2020 |
---|---|
Country/Territory | United States |
City | Virtual, Online |
Period | 10/3/20 → 10/7/20 |
Funding
This research was supported in part by the following sources: Defense Advanced Research Projects Agency (DARPA) Microsystems Technology Office (MTO) Domain-Specific System-on-Chip Program, the US Department of Energy (DOE) Advanced Scientific Computing Research (ASCR) program, and by an appointment to the Oak Ridge National Laboratory ASTRO Program, sponsored by DOE and administered by the Oak Ridge Institute for Science and Education. This manuscript has been co-authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan.
Keywords
- Energy-performance trade-off
- Heterogeneous systems
- Memory contention
- System on a Chip