Distributed Bayesian optimization of deep reinforcement learning algorithms

M. Todd Young, Jacob D. Hinkle, Ramakrishnan Kannan, Arvind Ramanathan

Research output: Contribution to journal › Article › peer-review

28 Scopus citations

Abstract

Significant strides have been made in supervised learning settings thanks to the successful application of deep learning. Recent work has brought these techniques to bear on sequential decision processes in the area of deep reinforcement learning (DRL). Currently, little is known regarding hyperparameter optimization for DRL algorithms. Given that DRL algorithms are computationally intensive to train and known to be sample inefficient, optimizing model hyperparameters for DRL presents significant challenges to established techniques. We provide an open source, distributed Bayesian model-based optimization algorithm, HyperSpace, and show that it consistently outperforms standard hyperparameter optimization techniques across three DRL algorithms.
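To illustrate the kind of search the abstract describes, the sketch below tunes two common DRL hyperparameters (learning rate and discount factor) with Gaussian-process-based Bayesian optimization using scikit-optimize's gp_minimize. This is a generic, assumed example rather than the paper's HyperSpace implementation, and simulated_drl_return is a hypothetical stand-in for an expensive DRL training-and-evaluation run.

    # Minimal sketch of Bayesian hyperparameter optimization for a DRL agent.
    # NOTE: generic illustration with scikit-optimize, NOT the HyperSpace
    # package described in the paper. `simulated_drl_return` is a hypothetical
    # placeholder for an actual (expensive) DRL training run.
    import numpy as np
    from skopt import gp_minimize
    from skopt.space import Real

    def simulated_drl_return(params):
        """Stand-in objective: returns a value to minimize as a function of
        learning rate and discount factor (gamma)."""
        lr, gamma = params
        # Smooth synthetic surface peaking near lr = 3e-4, gamma = 0.99.
        score = -((np.log10(lr) + 3.5) ** 2) - 200.0 * (gamma - 0.99) ** 2
        return -score  # gp_minimize minimizes, so negate the simulated return

    search_space = [
        Real(1e-5, 1e-2, prior="log-uniform", name="learning_rate"),
        Real(0.90, 0.999, name="gamma"),
    ]

    # Sequential model-based optimization with a Gaussian process surrogate.
    result = gp_minimize(
        simulated_drl_return,
        search_space,
        n_calls=25,      # number of hyperparameter configurations evaluated
        random_state=0,
    )

    print("best (learning_rate, gamma):", result.x)
    print("best objective value:", result.fun)

In practice the objective would launch a full DRL training job and report a validation return, which is exactly why such evaluations are expensive and why distributing the search, as the paper proposes, matters.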

Original language: English
Pages (from-to): 43-52
Number of pages: 10
Journal: Journal of Parallel and Distributed Computing
Volume: 139
DOIs
State: Published - May 2020

Funding

This work has been supported in part by the Joint Design of Advanced Computing Solutions for Cancer (JDACS4C) program established by the U.S. Department of Energy (DOE) and the National Cancer Institute (NCI) of the National Institutes of Health. This work was performed under the auspices of the U.S. Department of Energy by Argonne National Laboratory under Contract DE-AC02-06CH11357, Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, Los Alamos National Laboratory under Contract DE-AC52-06NA25396, and Oak Ridge National Laboratory under Contract DE-AC05-00OR22725. This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of the manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Funders and funder numbers

DOE Public Access Plan
U.S. Department of Energy Office of Science
United States Government
National Institutes of Health
U.S. Department of Energy
National Cancer Institute
National Nuclear Security Administration
Argonne National Laboratory: DE-AC02-06CH11357
Lawrence Livermore National Laboratory: DE-AC52-07NA27344
Oak Ridge National Laboratory: 17-SC-20-SC, DE-AC05-00OR22725
Los Alamos National Laboratory: DE-AC52-06NA25396

Keywords

• Bayesian optimization
• Deep reinforcement learning
