Low complex & high accuracy computation approximations to enable on-device RNN applications

Sirish Kumar Pasupuleti, Raj Narayana Gadde, Vasanthakumar Rajagopal, Ashok Vishnoi, N. Chandra Sekhar, R. Chandra Kumar, Narasinga Rao Miniskar

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

4 Scopus citations

Abstract

Recurrent Neural Networks (RNNs) have demonstrated excellent results on various Automatic Speech Recognition (ASR) and Natural Language Processing (NLP) tasks. However, executing RNNs requires large amounts of memory and computation, which makes it difficult to achieve real-time performance on low-power devices such as smartphones. Hence, ASR and NLP applications such as voice assistants currently rely on cloud-based solutions. In this paper, to enable on-device inference, we propose efficient approximations for the weights of fully connected (FC) layers and for activation functions to reduce computational complexity. The proposed approximations eliminate multiplications, divisions and exponential operations by replacing them with simple arithmetic operations (shifts and additions), significantly reducing the computation requirements without any perceivable loss of functional accuracy. The approximations also reduce memory size and bandwidth requirements. We also present a lightweight VLIW-based DSP architecture that incorporates these approximations to enable on-device inference. The approximations have been tested on the proposed DSP with various RNN applications such as EESEN, LRCN and S2VT. With the approximations, the results show accuracies similar to those of the 32-bit floating-point reference, ∼8x-12x performance gains, and ∼2x-4x gains in memory requirement and bandwidth. Moreover, the activation approximations show better average and peak errors than the state of the art.
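
The abstract does not spell out how FC weights are turned into shifts, so the following is only a minimal sketch of one common way to do it, assuming each weight is rounded to a single signed power of two and activations are held as fixed-point integers. The helper names `quantize_pow2` and `shift_dot`, the exponent range, and the Q8 input format are all illustrative assumptions, not the authors' scheme.

```python
import math

def quantize_pow2(w, min_exp=-8, max_exp=0):
    """Round a weight to sign * 2**exp (hypothetical helper).

    Returns (sign, exp); sign is 0 for a zero weight. The exponent
    range is an assumption chosen for illustration.
    """
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    exp = round(math.log2(abs(w)))          # nearest exponent in the log domain
    exp = max(min_exp, min(max_exp, exp))   # clamp to the storable range
    return sign, exp

def shift_dot(x_fixed, qweights):
    """Dot product using only shifts and additions.

    x_fixed: integer (fixed-point) activations.
    qweights: list of (sign, exp) pairs from quantize_pow2.
    """
    acc = 0
    for x, (sign, exp) in zip(x_fixed, qweights):
        if sign == 0:
            continue
        # Multiplying by 2**exp is a left shift; by 2**-k, a right shift.
        term = x << exp if exp >= 0 else x >> -exp
        acc += term if sign > 0 else -term
    return acc

# Activations 1.0, -0.5, 0.25 in Q8 fixed point (scaled by 256).
x_q8 = [256, -128, 64]
weights = [quantize_pow2(w) for w in [0.5, 0.25, -1.0]]
result = shift_dot(x_q8, weights)  # 32, i.e. 0.125 in Q8
```

Under these assumptions each weight needs only a sign and a small exponent rather than a 32-bit float, which is one way a reduction in memory size and bandwidth of the kind the abstract reports could arise.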

Original language: English
Title of host publication: 2019 IEEE International Symposium on Circuits and Systems, ISCAS 2019 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728103976
State: Published - 2019
Externally published: Yes
Event: 2019 IEEE International Symposium on Circuits and Systems, ISCAS 2019 - Sapporo, Japan
Duration: May 26 2019 - May 29 2019

Publication series

Name: Proceedings - IEEE International Symposium on Circuits and Systems
Volume: 2019-May
ISSN (Print): 0271-4310

Conference

Conference: 2019 IEEE International Symposium on Circuits and Systems, ISCAS 2019
Country/Territory: Japan
City: Sapporo
Period: 05/26/19 - 05/29/19

Keywords

  • On-Device Inference
  • Sigmoid TanH piece-wise approximations
  • Weights approximations as shifts
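
As an illustration of what a piece-wise sigmoid/tanh approximation built from shifts and additions can look like, the sketch below uses the well-known PLAN segments (slopes 1/4, 1/8 and 1/32) in Q8 fixed point. It is not the approximation proposed in this paper, whose segments are not given in this record; the function names and the Q8 format are assumptions made for the example.

```python
import math

FRAC = 8            # assumed number of fractional bits (Q8)
ONE = 1 << FRAC     # fixed-point representation of 1.0

def plan_sigmoid_q(x_q):
    """PLAN piece-wise linear sigmoid on a Q8 integer input.

    Every slope is a power of two, so each segment is one shift
    plus one addition; no multiply, divide or exponential is used.
    """
    a = -x_q if x_q < 0 else x_q      # work on |x|, mirror at the end
    if a >= 5 * ONE:                  # |x| >= 5      -> 1
        y = ONE
    elif a >= 608:                    # |x| >= 2.375  -> |x|/32 + 0.84375
        y = (a >> 5) + 216
    elif a >= ONE:                    # |x| >= 1      -> |x|/8 + 0.625
        y = (a >> 3) + 160
    else:                             # |x| <  1      -> |x|/4 + 0.5
        y = (a >> 2) + 128
    return ONE - y if x_q < 0 else y  # sigmoid(-x) = 1 - sigmoid(x)

def plan_tanh_q(x_q):
    """tanh via the identity tanh(x) = 2*sigmoid(2x) - 1, shifts only."""
    return (plan_sigmoid_q(x_q << 1) << 1) - ONE

def max_abs_error(lo=-8.0, hi=8.0):
    """Peak error of plan_sigmoid_q against the float sigmoid on a Q8 grid."""
    worst = 0.0
    for x_q in range(int(lo * ONE), int(hi * ONE) + 1):
        exact = 1.0 / (1.0 + math.exp(-x_q / ONE))
        worst = max(worst, abs(plan_sigmoid_q(x_q) / ONE - exact))
    return worst
```

Calling `max_abs_error()` gives the peak error of this particular sketch, which is the kind of figure the abstract compares against the state of the art; the paper reports that its own activation approximations achieve better average and peak errors.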
