Abstract
We present a novel distributed inference benchmarking system, called 'iBench', that provides relevant performance metrics for high-performance edge computing systems using trained deep learning models. The benchmark is unique in that it measures data transfer performance through a distributed system, such as a supercomputer, using clients and servers to provide a system-level benchmark. iBench is flexible and robust enough to benchmark custom-built inference servers, which we demonstrate by developing a custom Flask-based inference server to serve MLPerf's official ResNet50v1.5 model. In this paper, we compare iBench against MLPerf inference performance on an 8-V100 GPU node. iBench provides two primary advantages over MLPerf: (1) the ability to measure distributed inference performance, and (2) a more realistic measure of benchmark performance for inference servers on HPC by accounting for factors beyond inference time, such as HTTP request-response time, payload pre-processing and packing time, and ingest time.
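The abstract's central point is that end-to-end inference latency through an HTTP client-server path includes more than model execution time. The following is a minimal, hypothetical sketch of that decomposition using only the Python standard library (standing in for the paper's Flask-based server, whose actual code is not shown here): a server reports its model-only time, while the client measures the full request-response time, so the gap between the two reflects transfer, packing, and request overhead.

```python
import json
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib import request as urlrequest

class InferenceHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for an inference endpoint: the 'model' is a sleep."""

    def do_POST(self):
        payload = self.rfile.read(int(self.headers["Content-Length"]))
        t0 = time.perf_counter()
        time.sleep(0.05)  # placeholder for model execution (e.g. ResNet50v1.5)
        model_s = time.perf_counter() - t0
        body = json.dumps({"model_s": model_s, "n_bytes": len(payload)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the benchmark output quiet
        pass

def run_once(port=8765):
    """Return (end-to-end seconds, server-reported model seconds) for one request."""
    server = ThreadingHTTPServer(("127.0.0.1", port), InferenceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    t0 = time.perf_counter()  # client-side clock: includes transfer + HTTP overhead
    req = urlrequest.Request(f"http://127.0.0.1:{port}/infer",
                             data=b"\x00" * 4096, method="POST")
    with urlrequest.urlopen(req) as resp:
        reply = json.loads(resp.read())
    end_to_end_s = time.perf_counter() - t0
    server.shutdown()
    return end_to_end_s, reply["model_s"]

if __name__ == "__main__":
    total, model = run_once()
    print(f"end-to-end {total * 1000:.1f} ms vs model-only {model * 1000:.1f} ms")
```

The end-to-end figure is always larger than the model-only figure; the difference is exactly the class of overhead (request-response, payload packing, ingest) that a system-level benchmark like iBench is designed to capture and MLPerf's node-local measurement does not.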
| Original language | English |
|---|---|
| Title of host publication | 2020 IEEE High Performance Extreme Computing Conference, HPEC 2020 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| ISBN (Electronic) | 9781728192192 |
| DOIs | |
| State | Published - Sep 22 2020 |
| Externally published | Yes |
| Event | 2020 IEEE High Performance Extreme Computing Conference, HPEC 2020 - Virtual, Waltham, United States. Duration: Sep 21 2020 → Sep 25 2020 |
Publication series
| Name | 2020 IEEE High Performance Extreme Computing Conference, HPEC 2020 |
|---|---|
Conference
| Conference | 2020 IEEE High Performance Extreme Computing Conference, HPEC 2020 |
|---|---|
| Country/Territory | United States |
| City | Virtual, Waltham |
| Period | 09/21/20 → 09/25/20 |
Funding
This material is based upon work supported by, or in part by, the Department of Defense High Performance Computing Modernization Program (HPCMP) under User Productivity, Enhanced Technology Transfer, and Training (PET) contracts #GS04T09DBC0017 and #47QFSA18K0111. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the DoD HPCMP.
Keywords
- GPU
- ResNet50
- TensorRT
- benchmark
- distributed
- inference