Convergence analysis for a nonlocal gradient descent method via directional Gaussian smoothing

Research output: Contribution to journal › Article › peer-review

Abstract

We analyze the convergence of a nonlocal gradient descent method for minimizing a class of high-dimensional nonconvex functions, where a directional Gaussian smoothing (DGS) is used to define the nonlocal gradient (also referred to as the DGS gradient). The method was first proposed in [Zhang et al., Enabling long-range exploration in minimization of multimodal functions, UAI 2021], in which multiple numerical experiments showed that replacing the traditional local gradient with the DGS gradient can help the optimizers escape local minima more easily and significantly improve their performance. However, a rigorous theory for the efficiency of the method on nonconvex landscapes is lacking. In this work, we investigate the scenario where the objective function is composed of a convex function perturbed by deterministic oscillating noise. We provide a convergence theory under which the iterates converge exponentially to a tightened neighborhood of the solution, whose size is characterized by the noise wavelength. We also establish a correlation between the optimal values of the Gaussian smoothing radius and the noise wavelength, thus justifying the advantage of using moderate or large smoothing radii with the method. Furthermore, if the noise level decays to zero when approaching the global minimum, we prove that DGS-based optimization converges to the exact global minimum at a linear rate, similarly to standard gradient-based methods in optimizing convex functions. Several numerical experiments are provided to confirm our theory and illustrate the superiority of the approach over those based on the local gradient.
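The DGS gradient described in the abstract can be illustrated with a minimal sketch. It assumes the smoothed directional derivative along a unit direction ξ is written as (1/(σ√π)) ∫ √2 t F(x + √2 σ t ξ) e^{-t²} dt and approximated with Gauss–Hermite quadrature. The names `dgs_gradient`, `dgs_descent`, `num_nodes`, and `step`, along with the default choice of coordinate axes as the orthonormal directions, are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def dgs_gradient(f, x, sigma, num_nodes=7, directions=None):
    """Approximate a DGS-style nonlocal gradient of f at x (illustrative sketch).

    For each orthonormal direction xi, the 1D slice y -> f(x + y * xi) is
    smoothed with a Gaussian of radius sigma, and the derivative of the
    smoothed slice at y = 0 is estimated by Gauss-Hermite quadrature.
    """
    x = np.asarray(x, dtype=float)
    dim = x.size
    if directions is None:
        # Assumption: use the coordinate axes as the orthonormal directions.
        directions = np.eye(dim)

    # Nodes t_m and weights w_m for integrals against exp(-t^2).
    t, w = np.polynomial.hermite.hermgauss(num_nodes)

    grad = np.zeros(dim)
    for xi in directions:
        # Function values along the line through x in direction xi.
        vals = np.array([f(x + np.sqrt(2.0) * sigma * tm * xi) for tm in t])
        # Quadrature estimate of the smoothed directional derivative.
        deriv = np.sum(w * np.sqrt(2.0) * t * vals) / (sigma * np.sqrt(np.pi))
        grad += deriv * xi
    return grad


def dgs_descent(f, x0, sigma, step=0.1, iters=100, num_nodes=7):
    """Plain gradient descent with the local gradient replaced by dgs_gradient."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(iters):
        x = x - step * dgs_gradient(f, x, sigma, num_nodes=num_nodes)
    return x
```

For a purely quadratic objective such as `f(z) = 0.5 * np.dot(z, z)`, Gaussian smoothing only adds a constant, so this estimate coincides with the ordinary gradient up to rounding. On an objective with oscillating noise, the smoothing radius `sigma` and the number of quadrature nodes would need to be chosen relative to the noise wavelength, which is the relationship the abstract says the paper analyzes.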

Original language: English
Pages (from-to): 481-513
Number of pages: 33
Journal: Computational Optimization and Applications
Volume: 90
Issue number: 2
State: Published - Mar 2025

Funding

We would like to thank Ting Kei Pong (The Hong Kong Polytechnic University) and Olena Burkovska (Oak Ridge National Laboratory) for their valuable feedback on the error of Gauss–Hermite quadrature. This work was supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program, under the contract ERKJ387, and accomplished at Oak Ridge National Laboratory (ORNL). ORNL is operated by UT-Battelle, LLC, for the U.S. Department of Energy under Contract DE-AC05-00OR22725. The research of Qiang Du is supported in part by DE-SC0022317, DE-SC0025347, and NSF DMS 2309245.
