Beyond Stationary Points: A Non-parametric Perspective on FedAvg and FedProx
Authors: Lili Su, Jiaming Xu, Pengkun Yang; Journal of Machine Learning Research (JMLR), 24(203):1–48, 2023.
Abstract
Federated Learning (FL) is a decentralized learning framework with promising applications in privacy preservation and in reducing the computation load on the cloud. Recent studies have revealed that FedAvg and FedProx, two widely used FL algorithms, fail to converge to stationary points of the global optimization objective, even for homogeneous linear regression problems. Furthermore, concerns have been raised regarding the generalization capability of the common model learned in the presence of data heterogeneity.
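For concreteness, here is a minimal Python sketch of the two update rules on a toy heterogeneous linear-regression federation. All names, step sizes, and data are ours and purely illustrative; this is not the paper's experimental setup.

```python
import numpy as np

def local_update(theta_global, X, y, lr=0.1, local_steps=5, mu=0.0):
    """One client's update: gradient steps on its local squared loss.

    mu = 0 gives a FedAvg-style update; mu > 0 adds the FedProx
    proximal term (mu/2) * ||theta - theta_global||^2.
    """
    theta = theta_global.copy()
    n = len(y)
    for _ in range(local_steps):
        grad = X.T @ (X @ theta - y) / n     # gradient of local least squares
        grad += mu * (theta - theta_global)  # FedProx proximal pull
        theta = theta - lr * grad
    return theta

def fed_round(theta_global, clients, weights, **kwargs):
    """Server step: weighted average of the clients' local updates."""
    updates = [local_update(theta_global, X, y, **kwargs) for X, y in clients]
    return sum(w * u for w, u in zip(weights, updates))

# Toy heterogeneous federation: each client has its own ground truth.
rng = np.random.default_rng(0)
clients = []
for k in range(4):
    X = rng.normal(size=(30, 3))
    theta_k = np.array([1.0, 0.0, -1.0]) + 0.2 * rng.normal(size=3)  # mild heterogeneity
    clients.append((X, X @ theta_k + 0.1 * rng.normal(size=30)))
sizes = np.array([len(y) for _, y in clients], dtype=float)
weights = sizes / sizes.sum()  # unbalanced local datasets would change these weights

theta = np.zeros(3)
for t in range(200):
    theta = fed_round(theta, clients, weights, mu=0.1)  # mu=0.0 recovers FedAvg
print("federated estimate:", theta)
```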
In this paper, we analyze the convergence and statistical efficiency of FedAvg and FedProx, addressing the aforementioned concerns. Our analysis is based on non-parametric regression in a reproducing kernel Hilbert space (RKHS), allowing for heterogeneous local data distributions and unbalanced local datasets. We prove that the estimation errors, measured in either the empirical norm or the RKHS norm, decay at a rate of $1/t$ in general and exponentially fast for finite-rank kernels. In certain heterogeneous settings, these upper bounds also imply that both FedAvg and FedProx achieve the optimal error rate.
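Schematically, the rate statements above take the following form, where $\hat f_t$ denotes the federated estimate after $t$ rounds and $f^\star$ the underlying truth. The notation is ours, not the paper's exact theorem statements.

```latex
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
% Schematic rates; the error can be read in either the empirical norm
% $\|\cdot\|_n$ or the RKHS norm $\|\cdot\|_{\mathcal{H}}$.
\[
  \|\hat f_t - f^\star\|^2 \;\lesssim\; \frac{1}{t}
    \quad\text{(general kernels)},
  \qquad
  \|\hat f_t - f^\star\|^2 \;\lesssim\; e^{-c\,t}
    \quad\text{(finite-rank kernels, some } c > 0\text{)}.
\]
\end{document}
```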
To further quantify the impact of heterogeneity at each client, we propose and characterize a novel notion called “federation gain,” which measures the reduction in estimation error that a client obtains by joining the federation. We find that when the data heterogeneity is moderate, a client with limited local data can benefit substantially from the common model, i.e., enjoy a large federation gain; one plausible formalization is sketched below.
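One plausible formalization of the federation gain of client $k$, in our notation (the paper's precise definition may differ): the ratio of the client's local-only estimation error to its error under the federated estimate.

```latex
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
% Ratio of client k's local-only estimation error to its error under
% the federated estimate; a value above 1 means joining helps.
\[
  \mathrm{FG}_k \;=\;
  \frac{\mathbb{E}\bigl\|\hat f_k^{\,\mathrm{local}} - f_k^\star\bigr\|^2}
       {\mathbb{E}\bigl\|\hat f^{\,\mathrm{fed}} - f_k^\star\bigr\|^2}.
\]
\end{document}
```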
By taking the statistical aspect into account, we obtain two new insights: (1) the standard bounded-dissimilarity requirement is overly pessimistic for the convergence analysis of FedAvg and FedProx; (2) despite the inconsistency of their stationary points, the limiting points of the two algorithms are unbiased estimators of the underlying truth. We validate our theoretical findings through numerical experiments.