Nonconvex regularization for sparse neural networks

Konstantin Pieper, Armenak Petrosyan

Research output: Contribution to journal › Article › peer-review

Abstract

Convex ℓ1 regularization using an infinite dictionary of neurons has been suggested for constructing neural networks with desired approximation guarantees, but can be affected by an arbitrary amount of over-parametrization. This can lead to a loss of sparsity and result in networks with too many active neurons for the given data, in particular if the number of data samples is large. As a remedy, in this paper, a nonconvex regularization method is investigated in the context of shallow ReLU networks: We prove that in contrast to the convex approach, any resulting (locally optimal) network is finite even in the presence of infinite data (i.e., if the data distribution is known and the limiting case of infinite samples is considered). Moreover, we show that approximation guarantees and existing bounds on the network size for finite data are maintained.
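
As a minimal illustration of the setting the abstract describes (this is not the authors' code), the sketch below builds a shallow ReLU network and compares a convex ℓ1 regularizer on the outer weights against a nonconvex alternative. The log-type penalty used here is one illustrative representative of a sparsity-promoting nonconvex penalty, chosen by assumption; all function names, variable names, and parameter values are hypothetical.

```python
# Minimal sketch of l1 vs. nonconvex regularization for a shallow ReLU
# network (illustrative only; the specific log-penalty is an assumption,
# not necessarily the penalty class analyzed in the paper).
import numpy as np

def shallow_relu(x, a, b, c):
    """f(x) = sum_k c_k * relu(a_k . x + b_k) for inputs x of shape (n, d)."""
    return np.maximum(x @ a.T + b, 0.0) @ c

def l1_penalty(c, alpha):
    """Convex l1 regularizer on the outer weights."""
    return alpha * np.sum(np.abs(c))

def log_penalty(c, alpha, gamma):
    """Nonconvex regularizer log(1 + gamma*|c|)/gamma: concave in |c|,
    same slope as l1 at zero, but sublinear growth for large |c|."""
    return alpha * np.sum(np.log1p(gamma * np.abs(c)) / gamma)

rng = np.random.default_rng(0)
n, d, K = 200, 2, 50                       # samples, input dim, neurons
x = rng.normal(size=(n, d))
y = np.sin(x[:, 0])                        # toy regression target
a = rng.normal(size=(K, d))                # inner weights
b = rng.normal(size=K)                     # biases
c = rng.normal(size=K)                     # outer weights (regularized)

fit = 0.5 / n * np.sum((shallow_relu(x, a, b, c) - y) ** 2)
print("l1-regularized objective:       ", fit + l1_penalty(c, alpha=0.1))
print("nonconvex-regularized objective:", fit + log_penalty(c, alpha=0.1, gamma=5.0))
```

Because the log penalty is concave in |c_k|, spreading mass over many small outer weights costs more than concentrating it on a few large ones; this is the qualitative mechanism by which such penalties favor networks with finitely many active neurons, in contrast to the ℓ1 norm, for which both configurations cost the same.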

Original language: English
Pages (from-to): 25-56
Number of pages: 32
Journal: Applied and Computational Harmonic Analysis
Volume: 61
State: Published - Nov 2022

Funding

The material in this manuscript is based on work supported by the Laboratory Directed Research and Development Program at Oak Ridge National Laboratory (ORNL), managed by UT-Battelle, LLC, under Contract No. DE-AC05-00OR22725, and by the U.S. Department of Energy, Office of Science, Early Career Research Program under award number ERKJ314. The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Keywords

  • Neural network
  • Nonconvex
  • Sparsity
