Wootz: A compiler-based framework for fast CNN pruning via composability

Hui Guan, Xipeng Shen, Seung Hwan Lim

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

9 Scopus citations

Abstract

Convolutional Neural Networks (CNN) are widely used for Deep Learning tasks. CNN pruning is an important method to adapt a large CNN model trained on general datasets to fit a more specialized task or a smaller device. The key challenge is on deciding which filters to remove in order to maximize the quality of the pruned networks while satisfying the constraints. It is time-consuming due to the enormous configuration space and the slowness of CNN training. The problem has drawn many efforts from the machine learning field, which try to reduce the set of network configurations to explore. This work tackles the problem distinctively from a programming systems perspective, trying to speed up the evaluations of the remaining configurations through computation reuse via a compiler-based framework. We empirically uncover the existence of composability in the training of a collection of pruned CNN models, and point out the opportunities for computation reuse. We then propose composability-based CNN pruning, and design a compression-based algorithm to efficiently identify the set of CNN layers to pre-train for maximizing their reuse benefits in CNN pruning. We further develop a compiler-based framework named Wootz, which, for an arbitrary CNN, automatically generates code that builds a Teacher-Student scheme to materialize composability-based pruning. Experiments show that network pruning enabled by Wootz shortens the state-of-art pruning process by up to 186X while producing significantly improved pruning results.

Original languageEnglish
Title of host publicationPLDI 2019 - Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation
EditorsKathryn S. McKinley, Kathleen Fisher
PublisherAssociation for Computing Machinery
Pages717-730
Number of pages14
ISBN (Electronic)9781450367127
DOIs
StatePublished - Jun 8 2019
Event40th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2019 - Phoenix, United States
Duration: Jun 22 2019Jun 26 2019

Publication series

NameProceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)

Conference

Conference40th ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI 2019
Country/TerritoryUnited States
CityPhoenix
Period06/22/1906/26/19

Funding

We thank Louisnoel Pouchet and the anonymous PLDI reviewers for the helpful comments. This material is based upon work supported by the National Science Foundation (NSF) under Grant No. CCF-1525609, CNS-1717425, CCF-1703487. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of NSF. This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725.

FundersFunder number
National Science FoundationCCF-1703487, CCF-1525609, CNS-1717425
Battelle
Office of ScienceDE-AC05-00OR22725
Small Business Innovative Research and Small Business Technology Transfer

    Keywords

    • CNN
    • Compiler
    • Composability
    • Network pruning

    Fingerprint

    Dive into the research topics of 'Wootz: A compiler-based framework for fast CNN pruning via composability'. Together they form a unique fingerprint.

    Cite this