Abstract
Convolutional Neural Networks (CNN) are widely used for Deep Learning tasks. CNN pruning is an important method to adapt a large CNN model trained on general datasets to fit a more specialized task or a smaller device. The key challenge is on deciding which filters to remove in order to maximize the quality of the pruned networks while satisfying the constraints. It is time-consuming due to the enormous configuration space and the slowness of CNN training. The problem has drawn many efforts from the machine learning field, which try to reduce the set of network configurations to explore. This work tackles the problem distinctively from a programming systems perspective, trying to speed up the evaluations of the remaining configurations through computation reuse via a compiler-based framework. We empirically uncover the existence of composability in the training of a collection of pruned CNN models, and point out the opportunities for computation reuse. We then propose composability-based CNN pruning, and design a compression-based algorithm to efficiently identify the set of CNN layers to pre-train for maximizing their reuse benefits in CNN pruning. We further develop a compiler-based framework named Wootz, which, for an arbitrary CNN, automatically generates code that builds a Teacher-Student scheme to materialize composability-based pruning. Experiments show that network pruning enabled by Wootz shortens the state-of-art pruning process by up to 186X while producing significantly improved pruning results.
Original language | English |
---|---|
Title of host publication | PLDI 2019 - Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation |
Editors | Kathryn S. McKinley, Kathleen Fisher |
Publisher | Association for Computing Machinery |
Pages | 717-730 |
Number of pages | 14 |
ISBN (Electronic) | 9781450367127 |
DOIs | |
State | Published - Jun 8 2019 |
Event | 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2019 - Phoenix, United States Duration: Jun 22 2019 → Jun 26 2019 |
Publication series
Name | Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI) |
---|
Conference
Conference | 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2019 |
---|---|
Country/Territory | United States |
City | Phoenix |
Period | 06/22/19 → 06/26/19 |
Funding
We thank Louisnoel Pouchet and the anonymous PLDI reviewers for the helpful comments. This material is based upon work supported by the National Science Foundation (NSF) under Grant No. CCF-1525609, CNS-1717425, CCF-1703487. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of NSF. This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725.
Keywords
- CNN
- Compiler
- Composability
- Network pruning