TY - GEN
T1 - A novel method to regenerate an optimal CNN by exploiting redundancy patterns in the network
AU - Pasupuleti, Sirish Kumar
AU - Miniskar, Narasinga Rao
AU - Rajagopal, Vasanthakumar
AU - Gadde, Raj Narayana
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2017/7/2
Y1 - 2017/7/2
N2 - Deploying Convolutional Neural Network (CNN)-based computer vision applications on low-power embedded devices is challenging due to massive computation and memory bandwidth requirements. Research is ongoing into faster algorithms, network pruning, and model compression techniques to produce lightweight networks. In this paper, we propose a novel method that exploits a redundancy pattern in the network to regenerate an efficient and functionally identical CNN for a given network. We identify the pattern based on the layer parameters (kernel size and stride) and data-flow analysis among the layers to avoid redundant processing and memory requirements while maintaining identical accuracy. Our proposed method augments state-of-the-art pruning and model compression techniques to achieve a further performance boost. The proposed method is evaluated with the Caffe [1] framework for ResNet-50 [2] inference on a Samsung smartphone with an octa-core ARM Cortex-A53 processor. The results show a 4x improvement in performance and memory at the layer level, and a ∼22% performance improvement and 6% memory reduction at the network level.
AB - Deploying Convolutional Neural Network (CNN)-based computer vision applications on low-power embedded devices is challenging due to massive computation and memory bandwidth requirements. Research is ongoing into faster algorithms, network pruning, and model compression techniques to produce lightweight networks. In this paper, we propose a novel method that exploits a redundancy pattern in the network to regenerate an efficient and functionally identical CNN for a given network. We identify the pattern based on the layer parameters (kernel size and stride) and data-flow analysis among the layers to avoid redundant processing and memory requirements while maintaining identical accuracy. Our proposed method augments state-of-the-art pruning and model compression techniques to achieve a further performance boost. The proposed method is evaluated with the Caffe [1] framework for ResNet-50 [2] inference on a Samsung smartphone with an octa-core ARM Cortex-A53 processor. The results show a 4x improvement in performance and memory at the layer level, and a ∼22% performance improvement and 6% memory reduction at the network level.
KW - Caffe
KW - Convolutional Neural Networks
KW - Deep Neural Networks
KW - Light-weight network
KW - ResNet
UR - http://www.scopus.com/inward/record.url?scp=85045300219&partnerID=8YFLogxK
U2 - 10.1109/ICIP.2017.8297115
DO - 10.1109/ICIP.2017.8297115
M3 - Conference contribution
AN - SCOPUS:85045300219
T3 - Proceedings - International Conference on Image Processing, ICIP
SP - 4407
EP - 4411
BT - 2017 IEEE International Conference on Image Processing, ICIP 2017 - Proceedings
PB - IEEE Computer Society
T2 - 24th IEEE International Conference on Image Processing, ICIP 2017
Y2 - 17 September 2017 through 20 September 2017
ER -