Lower Bounds and Accelerated Algorithms for Bilevel Optimization
Kaiyi Ji, Yingbin Liang; 24(22):1−56, 2023.
Abstract
Bilevel optimization has recently gained increased attention due to its wide range of applications in modern machine learning problems. While the convergence rates of several popular algorithms have been characterized in recent studies, it remains unclear how much further these convergence rates can be improved. This paper addresses this fundamental question from two perspectives. First, it provides the first-known lower complexity bounds of $\widetilde \Omega\left(\sqrt{\frac{L_y\widetilde L_{xy}^2}{\mu_x\mu_y^2}}\right)$ and $\widetilde \Omega\left(\frac{1}{\sqrt{\epsilon}}\min\{\kappa_y,\frac{1}{\sqrt{\epsilon^{3}}}\}\right)$ for strongly-convex-strongly-convex and convex-strongly-convex bilevel optimization, respectively. Second, it proposes an accelerated bilevel optimizer named AccBiO, for which it provides the first-known complexity bounds without the gradient boundedness assumption (which was made in existing analyses) under the two aforementioned geometries. The paper also provides significantly tighter upper bounds than the existing complexity results when the bounded gradient assumption does hold. It is shown that AccBiO achieves optimal results, where the upper and lower bounds match up to logarithmic factors, when the inner-level problem takes a quadratic form with a constant-level condition number. Interestingly, the lower bounds under both geometries are larger than the corresponding optimal complexities of minimax optimization, establishing that bilevel optimization is provably more challenging than minimax optimization. The theoretical results are validated by numerical experiments.
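For context, the problem class referenced above is the standard nested bilevel formulation; the notation $f$, $g$, $x$, $y$ below is assumed here for illustration rather than taken verbatim from the abstract:
$$\min_{x} \; \Phi(x) := f\big(x, y^*(x)\big) \quad \text{s.t.} \quad y^*(x) = \arg\min_{y} \, g(x, y),$$
where the strongly-convex-strongly-convex and convex-strongly-convex geometries refer to the curvature of the outer objective in $x$ (strongly convex or convex) combined with strong convexity of the inner objective $g(x,\cdot)$ in $y$.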