Ramifications of Evolving Misbehaving Convolutional Neural Network Kernel and Batch Sizes

Mark Coletti, Dalton Lunga, Anne Berres, Jibonananda Sanyal, Amy Rose

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Scopus citations

Abstract

Deep-learners have many hyper-parameters, including learning rate, batch size, and kernel size, all of which play a significant role in estimating high-quality models. Discovering useful hyper-parameter guidelines is an active area of research, though the state of the art generally uses a brute-force uniform grid approach or random search to find ideal settings. We share the preliminary results of an alternative approach to deep-learner hyper-parameter tuning that uses an evolutionary algorithm to improve the accuracy of deep-learner models used in satellite imagery building footprint detection. We found that the kernel and batch size hyper-parameters surprisingly differed from sizes arrived at via a brute-force uniform grid approach. These differences suggest a novel role for evolutionary algorithms in determining the number of convolution layers, as well as smaller batch sizes in improving deep-learner models.
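To make the contrast in the abstract concrete, the following is a minimal sketch of an evolutionary search over kernel size and batch size, set beside an exhaustive grid search on the same space. It is not the authors' implementation: the search space, the `toy_fitness` stand-in for validation accuracy, the mutation rate, and all function names are assumptions made for illustration only.

```python
import random

# Hypothetical search space; these values are illustrative, not the paper's.
KERNEL_SIZES = [3, 5, 7, 9, 11]
BATCH_SIZES = [8, 16, 32, 64, 128]


def toy_fitness(kernel_size, batch_size):
    """Stand-in for validation accuracy; a real run would train a CNN here.

    The optimum (kernel 7, batch 16) is arbitrary and chosen only so the
    example has something to find.
    """
    return -abs(kernel_size - 7) - abs(batch_size - 16) / 8.0


def mutate(individual, rng, rate=0.3):
    """Resample each gene from its allowed values with probability `rate`."""
    kernel_size, batch_size = individual
    if rng.random() < rate:
        kernel_size = rng.choice(KERNEL_SIZES)
    if rng.random() < rate:
        batch_size = rng.choice(BATCH_SIZES)
    return (kernel_size, batch_size)


def evolve(generations=30, pop_size=10, seed=0):
    """Elitist loop: each generation keeps the best of parents plus offspring."""
    rng = random.Random(seed)
    population = [
        (rng.choice(KERNEL_SIZES), rng.choice(BATCH_SIZES))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        offspring = [mutate(parent, rng) for parent in population]
        combined = population + offspring
        combined.sort(key=lambda ind: toy_fitness(*ind), reverse=True)
        population = combined[:pop_size]  # truncation selection
    return max(population, key=lambda ind: toy_fitness(*ind))


def grid_search():
    """Exhaustive baseline over the same grid, for comparison."""
    return max(
        ((k, b) for k in KERNEL_SIZES for b in BATCH_SIZES),
        key=lambda ind: toy_fitness(*ind),
    )


if __name__ == "__main__":
    print("evolved:", evolve())
    print("grid:   ", grid_search())
```

In this toy setting the grid is small enough to enumerate, so the grid search always finds the optimum. The evolutionary loop becomes more attractive when each fitness evaluation is a full training run and the search space is too large to cover exhaustively.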

Original language: English
Title of host publication: Proceedings of MLHPC 2018
Subtitle of host publication: Machine Learning in HPC Environments, Held in conjunction with SC 2018: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 106-113
Number of pages: 8
ISBN (Electronic): 9781728101804
DOIs
State: Published - Jul 2 2018
Event: 2018 IEEE/ACM Machine Learning in HPC Environments, MLHPC 2018 - Dallas, United States
Duration: Nov 12 2018 → …

Publication series

Name: Proceedings of MLHPC 2018: Machine Learning in HPC Environments, Held in conjunction with SC 2018: The International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 2018 IEEE/ACM Machine Learning in HPC Environments, MLHPC 2018
Country/Territory: United States
City: Dallas
Period: 11/12/18 → …

Funding

This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

VI. ACKNOWLEDGEMENTS

Thanks to Jeanette Weaver of ORNL for supplying the building detection image used for Fig. 1. Thanks to Dalton Lunga, also of ORNL, for supplying the training and validation data as well as the original architecture with attendant source code. We would also like to thank the ASCR Leadership Computing Challenge (ALCC) for the 25 million core hours on Titan that made this research possible. This research was funded by the Federal Emergency Management Agency (FEMA), the Bill and Melinda Gates Foundation, and via undisclosed federal funding sources.

Funders | Funder number
U.S. Department of Energy
Bill and Melinda Gates Foundation

    Keywords

    • Convolutional neural networks
    • Deep learning
    • Evolutionary algorithms
    • Hyper-parameters
    • Optimization
    • Satellite imagery
    • Settlement detection
