Accelerate distributed stochastic descent for nonconvex optimization with momentum

Guojing Cong, Tianyi Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

Momentum methods have been used extensively in optimizers for deep learning. Recent studies show that distributed training through K-step averaging has many desirable properties. We propose a momentum method for such model-averaging approaches. At the individual-learner level, traditional stochastic gradient descent is applied. At the meta level (the global-learner level), a single momentum term is applied; we call it block momentum. We analyze the convergence and scaling properties of such momentum methods. Our experimental results show that block momentum not only accelerates training but also achieves better results.
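To make the structure described in the abstract concrete, here is a minimal sketch of K-step averaging with a momentum term applied at the meta level. The specific update rule (momentum on the averaged block delta, in the spirit of BMUF-style methods), the hyperparameters mu, lr, and k, and all function names are illustrative assumptions, not the paper's exact algorithm.

```python
# Sketch: K-step averaging with a meta-level ("block") momentum term.
# The update rule and hyperparameters below are assumptions for illustration;
# the paper's exact formulation may differ.
import numpy as np

def local_sgd(w, grad_fn, lr, k, rng):
    """Run k steps of plain SGD from w using stochastic gradients."""
    w = w.copy()
    for _ in range(k):
        w -= lr * grad_fn(w, rng)
    return w

def block_momentum_train(w0, grad_fns, lr=0.05, k=10, rounds=50, mu=0.9):
    """K-step averaging; one momentum term is applied at the global level."""
    w = w0.copy()
    m = np.zeros_like(w)                  # block-momentum buffer
    rng = np.random.default_rng(0)
    for _ in range(rounds):
        # Each learner runs K local SGD steps independently.
        local_models = [local_sgd(w, g, lr, k, rng) for g in grad_fns]
        # Meta level: the averaged model update is the "block" direction.
        delta = np.mean(local_models, axis=0) - w
        m = mu * m + delta                # momentum on the block update
        w = w + m
    return w

# Toy demo: each learner sees a noisy quadratic with a different optimum,
# so the averaging step is nontrivial.
if __name__ == "__main__":
    targets = [np.array([1.0, -2.0]), np.array([3.0, 0.5])]
    grad_fns = [
        (lambda w, rng, t=t: (w - t) + 0.1 * rng.standard_normal(2))
        for t in targets
    ]
    w = block_momentum_train(np.zeros(2), grad_fns)
    print("final model:", w)  # approaches the mean of the targets
```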

Original language: English
Title of host publication: Proceedings of 2020 IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments, MLHPC 2020 and Workshop on Artificial Intelligence and Machine Learning for Scientific Applications, AI4S 2020 - Held in conjunction with SC 2020
Subtitle of host publication: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 29-39
Number of pages: 11
ISBN (Electronic): 9780738110783
State: Published - Nov 2020
Externally published: Yes
Event: 6th IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments, MLHPC 2020 and 1st Workshop on Artificial Intelligence and Machine Learning for Scientific Applications, AI4S 2020 - Virtual, Online, United States
Duration: Nov 12 2020 → …

Publication series

Name: Proceedings of 2020 IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments, MLHPC 2020 and Workshop on Artificial Intelligence and Machine Learning for Scientific Applications, AI4S 2020 - Held in conjunction with SC 2020: The International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 6th IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments, MLHPC 2020 and 1st Workshop on Artificial Intelligence and Machine Learning for Scientific Applications, AI4S 2020
Country/Territory: United States
City: Virtual, Online
Period: 11/12/20 → …

