Improved Complexity for Restarted Nonconvex Accelerated Gradient Descent

By Huan Li and Zhouchen Lin; 24(157):1−37, 2023.

Abstract

This paper focuses on accelerated gradient methods for nonconvex optimization problems with Lipschitz continuous gradient and Hessian. We propose two simple accelerated gradient methods, restarted accelerated gradient descent (AGD) and restarted heavy ball (HB), and prove that they reach an $\epsilon$-approximate first-order stationary point within $O(\epsilon^{-7/4})$ gradient evaluations. Our complexity does not hide any polylogarithmic factor and thus improves upon the best-known complexity by a factor of $O(\log \frac{1}{\epsilon})$. Our algorithms are simple in the sense that they consist only of Nesterov's classical AGD or Polyak's HB iterations together with a restart mechanism; they require neither negative curvature exploitation nor the minimization of regularized surrogate functions as subroutines. In contrast to previous analyses, our elementary proofs use less advanced techniques and do not rely on the analysis of strongly convex AGD or HB.
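To illustrate the kind of method the abstract describes, below is a minimal, hypothetical sketch of Nesterov-style AGD with a restart mechanism for nonconvex problems. It is not the paper's exact algorithm: the step size `eta`, momentum parameter `theta`, restart radius `B`, and the specific restart and stopping rules here are illustrative assumptions; the precise choices and the restart criterion are specified in the paper.

```python
import numpy as np

def restarted_agd(grad, x0, eta, theta, B, eps, max_iter=100_000):
    """Illustrative sketch of restarted AGD (not the paper's exact method).

    grad  : gradient oracle of the smooth nonconvex objective
    eta   : step size (assumed on the order of 1/L for an L-Lipschitz gradient)
    theta : momentum parameter in (0, 1)
    B     : assumed restart radius; restart once the iterates have moved
            more than B (cumulatively) since the last restart
    eps   : target accuracy; stop when the gradient norm is at most eps
    """
    x = x_prev = np.asarray(x0, dtype=float)
    moved = 0.0  # cumulative squared movement since the last restart
    for _ in range(max_iter):
        # Nesterov-style step: extrapolate with momentum, then take a gradient step
        y = x + (1.0 - theta) * (x - x_prev)
        x_prev, x = x, y - eta * grad(y)
        moved += np.linalg.norm(x - x_prev) ** 2
        if np.linalg.norm(grad(x)) <= eps:
            return x                      # eps-approximate first-order stationary point
        if moved > B ** 2:                # restart: drop the momentum, reset the counter
            x_prev, moved = x, 0.0
    return x

# Example usage on a simple smooth nonconvex function f(x) = sum(x_i^2 - cos(x_i)):
# grad = lambda x: 2.0 * x + np.sin(x)
# x_hat = restarted_agd(grad, np.ones(10), eta=0.1, theta=0.3, B=1.0, eps=1e-4)
```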
