Abstract
This article considers trends in heterogeneous system design, particularly for GPUs. Using the Keeneland Initial Delivery System, the authors examine the performance implications of increased parallelism and specialized hardware on parallel scientific applications. They examine how nonuniform data-transfer performance across the node-level topology can impact performance. Finally, they help users of GPU-based systems avoid performance problems related to this nonuniformity.
Original language | English |
---|---|
Article number | 5989784 |
Pages (from-to) | 66-75 |
Number of pages | 10 |
Journal | IEEE Micro |
Volume | 31 |
Issue number | 5 |
State | Published - Sep 2011 |
Funding
The submitted manuscript has been authored by Oak Ridge National Laboratory, which is managed by UT-Battelle under contract DE-AC05-00OR22725 to the US government. Accordingly, the US government retains a nonexclusive, royalty-free license to publish or reproduce the published form of this contribution, or allow others to do so, for US government purposes. This research was sponsored in part by the Office of Advanced Scientific Computing Research in the US Department of Energy, the NSF award OCI-0910735, and DARPA contract HR0011-10-9-0008. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the US government.
Funders | Funder number |
---|---|
US Department of Energy | |
National Science Foundation | OCI-0910735 |
Defense Advanced Research Projects Agency | HR0011-10-9-0008 |
Advanced Scientific Computing Research | |
Keywords
- GPU
- data-transfer performance
- heterogeneous GPUs
- nonuniformity