Preconditioned Gradient Descent for Overparameterized Nonconvex Burer–Monteiro Factorization with Global Optimality Certification

Gavin Zhang, Salar Fattahi, Richard Y. Zhang; 24(163):1−55, 2023.

Abstract

We consider minimizing the nonconvex function $f(X)=\phi(XX^{T})$ with gradient descent, where $X$ is an $n\times r$ factor matrix and $\phi$ is a smooth convex cost function defined over $n\times n$ matrices. While only a second-order stationary point $X$ can be provably found in reasonable time, if $X$ is additionally rank deficient, then its rank deficiency certifies it as globally optimal. This way of certifying global optimality necessarily requires the search rank $r$ of the current iterate $X$ to be overparameterized with respect to the rank $r^{\star}$ of the global minimizer $X^{\star}$. Unfortunately, overparameterization significantly slows down the convergence of gradient descent, from a linear rate with $r=r^{\star}$ to a sublinear rate when $r>r^{\star}$, even when $\phi$ is strongly convex. In this paper, we propose an inexpensive preconditioner that restores the linear convergence rate of gradient descent in the overparameterized case, while also making it agnostic to possible ill-conditioning in the global minimizer $X^{\star}$.
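To make the setting concrete, the sketch below runs gradient descent on $f(X)=\phi(XX^{T})$ with the gradient step rescaled on the right by $(X^{T}X+\eta I)^{-1}$, a common form of preconditioning for Burer–Monteiro factorizations. The quadratic cost $\phi$, the fixed damping $\eta$, the step size, and all other parameter values are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

# Hypothetical smooth convex cost: least-squares fit to a symmetric target M_star.
def phi(M, M_star):
    return 0.5 * np.linalg.norm(M - M_star, "fro") ** 2

def grad_phi(M, M_star):
    return M - M_star

def precond_gd_step(X, M_star, alpha=0.1, eta=1e-6):
    """One right-preconditioned gradient step on f(X) = phi(X X^T).

    By the chain rule, grad f(X) = 2 * grad_phi(X X^T) @ X. The step is
    rescaled on the right by (X^T X + eta * I)^{-1}, an r x r matrix that
    is cheap to form and invert when r is small. The fixed eta used here
    is an illustrative choice, not the paper's rule for setting it.
    """
    grad_f = 2.0 * grad_phi(X @ X.T, M_star) @ X
    r = X.shape[1]
    P = np.linalg.inv(X.T @ X + eta * np.eye(r))
    return X - alpha * grad_f @ P

# Overparameterized demo: search rank r exceeds the true rank r_star.
rng = np.random.default_rng(0)
n, r_star, r = 50, 2, 5
U = rng.standard_normal((n, r_star))
M_star = U @ U.T                        # rank-deficient global optimum
X = 0.1 * rng.standard_normal((n, r))   # small random initialization
for _ in range(500):
    X = precond_gd_step(X, M_star)
print("final cost:", phi(X @ X.T, M_star))
```

Plain gradient descent corresponds to dropping the factor $(X^{T}X+\eta I)^{-1}$; per the abstract, it is this rescaling that restores a linear convergence rate in the overparameterized regime $r>r^{\star}$.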
