A scalable algorithm for the optimization of neural network architectures

Research output: Contribution to journal › Article › peer-review

Abstract

We propose a new scalable method to optimize the architecture of an artificial neural network. The proposed algorithm, called Greedy Search for Neural Network Architecture, aims to determine a neural network with a minimal number of layers that is at least as performant, in terms of accuracy and computational cost, as neural networks of the same structure identified by other hyperparameter search algorithms. Numerical experiments on benchmark datasets show that, for these datasets, our method outperforms state-of-the-art hyperparameter optimization algorithms both in the predictive performance attainable by the selected neural network architecture and in the time-to-solution of the hyperparameter optimization.
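
As a rough illustration of the greedy constructive idea behind such a search (this is not the paper's algorithm; the layer width, depth cap, and tolerance below are arbitrary assumptions for the example), one can grow a multilayer perceptron one hidden layer at a time and stop as soon as an additional layer no longer improves validation accuracy:

    # Minimal sketch of a greedy depth search, assuming scikit-learn.
    # Illustrative only; not the authors' implementation.
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = load_digits(return_X_y=True)
    X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

    best_acc, best_depth = -1.0, 0
    for depth in range(1, 11):  # candidate depths 1..10 (assumed cap)
        clf = MLPClassifier(hidden_layer_sizes=(64,) * depth,  # assumed width
                            max_iter=500, random_state=0)
        clf.fit(X_tr, y_tr)
        acc = clf.score(X_val, y_val)  # validation accuracy
        if acc > best_acc + 1e-3:      # grow only while it clearly helps
            best_acc, best_depth = acc, depth
        else:
            break                      # greedy stopping criterion
    print(f"selected depth: {best_depth}, accuracy: {best_acc:.3f}")

Because each depth decision is local and requires training only one candidate network, a greedy loop of this shape avoids the combinatorial search over architectures that grid or random search would perform.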

Original language: English
Article number: 102788
Journal: Parallel Computing
Volume: 104-105
State: Published - Jul 2021

Funding

This manuscript has been authored in part by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). Massimiliano Lupo Pasini thanks Dr. Vladimir Protopopescu for his valuable feedback in the preparation of this manuscript and three anonymous reviewers for their very useful comments and suggestions. This work is supported in part by the Office of Science of the US Department of Energy (DOE) and by the LDRD Program of Oak Ridge National Laboratory, USA. This work used resources of the Oak Ridge Leadership Computing Facility (OLCF), which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. Y. W. Li was partly supported by the LDRD Program of Los Alamos National Laboratory (LANL), USA under project number 20190005DR. LANL is operated by Triad National Security, LLC, for the National Nuclear Security Administration of the U.S. Department of Energy (Contract No. 89233218CNA000001). This document number is LA-UR-21-20936.

Funders                                         Funder number
DOE Public Access Plan
U.S. Department of Energy                       89233218CNA000001, LA-UR-21-20936
Office of Science                               DE-AC05-00OR22725
National Nuclear Security Administration
Oak Ridge National Laboratory
Laboratory Directed Research and Development
Los Alamos National Laboratory                  20190005DR

Keywords

• Adaptive algorithms
• Deep learning
• Greedy constructive algorithms
• Hyperparameter optimization
• Neural network architecture
• Random search
