Partial data permutation for training deep neural networks

Guojing Cong, Li Zhang, Chih Chieh Yang

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Random data permutation is considered a best practice for training deep neural networks. When the input is large, permuting the full dataset is costly and limits scaling on distributed systems. Some practitioners use partial or no permutation, which may result in poor convergence. We propose a partitioned data permutation scheme as a low-cost alternative to full data permutation. By analyzing their entropy, we show that the two sampling schemes are asymptotically identical. We also show that with minibatch SGD, both sampling schemes produce unbiased estimators of the true gradient. In addition, they have the same bound on the second moment of the gradient, and thus they have similar convergence properties. Our experiments confirm that SGD has similar training performance in practice with both sampling schemes. We further show that, due to inherent randomness in training such as data augmentation and dropout, even sampling schemes faster than partial permutation, such as sequential sampling, can achieve good performance. However, if no extra randomness is present in training, sampling schemes with low entropy can indeed degrade performance significantly.
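
To make the contrast between the two sampling schemes concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that "partitioned permutation" means shuffling sample indices independently within contiguous partitions (for example, one partition per worker), and that entropy is measured as the log of the number of equally likely orderings; the abstract specifies neither detail. The function names and the `num_parts` parameter are hypothetical.

```python
import math
import random


def full_permutation(n, rng):
    """Return a uniformly random ordering of all n sample indices."""
    order = list(range(n))
    rng.shuffle(order)
    return order


def partitioned_permutation(n, num_parts, rng):
    """Shuffle indices only within contiguous partitions.

    Assumption: this is one plausible reading of the scheme named in the
    abstract; the paper's exact partitioning rule is not given there.
    """
    order = []
    for p in range(num_parts):
        start = p * n // num_parts
        end = (p + 1) * n // num_parts
        block = list(range(start, end))
        rng.shuffle(block)  # no sample ever leaves its own partition
        order.extend(block)
    return order


def log_num_orderings(n, num_parts=1):
    """Natural log of the number of orderings the scheme can produce.

    With each ordering equally likely, this equals the entropy in nats.
    num_parts=1 corresponds to full permutation, i.e. log(n!).
    """
    total = 0.0
    for p in range(num_parts):
        size = (p + 1) * n // num_parts - p * n // num_parts
        total += math.lgamma(size + 1)  # lgamma(k + 1) == log(k!)
    return total


rng = random.Random(0)
order = partitioned_permutation(12, 3, rng)

# Entropy of the partitioned scheme relative to full permutation,
# for a small and a large dataset with the same number of partitions.
ratio_small = log_num_orderings(10**3, 8) / log_num_orderings(10**3)
ratio_large = log_num_orderings(10**6, 8) / log_num_orderings(10**6)
```

Under these assumptions the entropy ratio stays below 1 and grows toward 1 as the dataset size increases with the number of partitions held fixed, which is one way to read the abstract's claim that the schemes are asymptotically identical.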

Original language: English
Title of host publication: Proceedings - 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGRID 2020
Editors: Laurent Lefevre, Carlos A. Varela, George Pallis, Adel N. Toosi, Omer Rana, Rajkumar Buyya
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 728-735
Number of pages: 8
ISBN (Electronic): 9781728160955
State: Published - May 2020
Externally published: Yes
Event: 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGRID 2020 - Melbourne, Australia
Duration: May 11 2020 to May 14 2020

Publication series

Name: Proceedings - 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGRID 2020

Conference

Conference: 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGRID 2020
Country/Territory: Australia
City: Melbourne
Period: 05/11/20 to 05/14/20
