Accurate and Efficient Fixed Point Inference for Deep Neural Networks

Vasanthakumar Rajagopal, Chandra Kumar Ramasamy, Ashok Vishnoi, Raj Narayana Gadde, Narasinga Rao Miniskar, Sirish Kumar Pasupuleti

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

10 Scopus citations

Abstract

Deploying DNNs on embedded devices is challenging because of their high memory and computational requirements. Performing DNN inference in lower bit-width fixed point arithmetic is seen as a crucial step in realizing DNNs on embedded devices. State-of-the-art methods achieve floating point accuracy using re-training and complex activation normalization methods. In this paper, we propose an accurate and efficient end-to-end DNN inference method using 16-bit fixed point arithmetic. We prove that floating point accuracy can be achieved without re-training by a simple quantization method that uses powers of 2 as scale factors, coupled with our optimal bit-width estimation algorithm. This choice of scale factors additionally enables efficient activation normalization using only arithmetic shifts. We show that the combination of our quantization method and activation normalization maximizes SIMD throughput, resulting in a 2x to 6x gain in execution time compared to floating point inference. Experimental results demonstrate that our method generalizes to different networks, giving the same or better accuracy compared to floating point for classification, regression, and recurrent networks.
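
As an illustration of why power-of-2 scale factors reduce activation normalization to arithmetic shifts, the following minimal Python sketch quantizes two vectors to 16-bit fixed point and renormalizes their dot product with a single shift. It is not the authors' implementation: the function names are hypothetical, and the max-abs rule in `frac_bits_for` is an assumed stand-in for the paper's bit-width estimation algorithm, which the abstract does not describe.

```python
import math

INT16_MIN, INT16_MAX = -32768, 32767


def frac_bits_for(values, total_bits=16):
    """Pick the number of fractional bits so the largest magnitude fits.

    Assumption: a simple max-abs rule, used here only for illustration.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return total_bits - 1
    # Integer bits needed for the magnitude (the sign bit is separate).
    int_bits = max(0, math.floor(math.log2(max_abs)) + 1)
    return total_bits - 1 - int_bits


def quantize(values, frac_bits):
    """Scale by 2**frac_bits, round, and saturate to the int16 range."""
    scale = 2 ** frac_bits
    return [max(INT16_MIN, min(INT16_MAX, round(v * scale))) for v in values]


def dequantize(q_values, frac_bits):
    """Map fixed point integers back to floats for inspection."""
    return [q / 2 ** frac_bits for q in q_values]


def fixed_dot(qa, qb, fa, fb, out_frac_bits):
    """Dot product with a wide accumulator, renormalized by one shift.

    The accumulator carries fa + fb fractional bits, so moving it to
    out_frac_bits needs only a shift, never a multiply or divide.
    """
    acc = sum(a * b for a, b in zip(qa, qb))
    shift = fa + fb - out_frac_bits
    # Python's >> on int is an arithmetic (sign-preserving) shift.
    acc = acc >> shift if shift >= 0 else acc << -shift
    return max(INT16_MIN, min(INT16_MAX, acc))


weights = [0.5, -0.25, 0.125]
activations = [1.5, 2.0, -3.0]
fw, fx = frac_bits_for(weights), frac_bits_for(activations)
y = fixed_dot(quantize(weights, fw), quantize(activations, fx), fw, fx, 12)
print(dequantize([y], 12)[0])
```

For these inputs the fixed point result matches the floating point dot product, -0.125, exactly, because every value is representable at the chosen scales. In general the rounding in `quantize` and the truncation in the shift introduce small errors.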

Original language: English
Title of host publication: 2018 IEEE International Conference on Image Processing, ICIP 2018 - Proceedings
Publisher: IEEE Computer Society
Pages: 1847-1851
Number of pages: 5
ISBN (Electronic): 9781479970612
DOIs
State: Published - Aug 29 2018
Externally published: Yes
Event: 25th IEEE International Conference on Image Processing, ICIP 2018 - Athens, Greece
Duration: Oct 7 2018 to Oct 10 2018

Publication series

Name: Proceedings - International Conference on Image Processing, ICIP
ISSN (Print): 1522-4880

Conference

Conference: 25th IEEE International Conference on Image Processing, ICIP 2018
Country/Territory: Greece
City: Athens
Period: 10/7/18 to 10/10/18

Keywords

  • Deep Neural Networks
  • Fixed Point Arithmetic
  • Inference
  • Quantization
