GANs as Gradient Flows that Converge

Yu-Jui Huang, Yuchong Zhang; Journal of Machine Learning Research, 24(217):1–40, 2023.

Abstract

This paper approaches unsupervised learning through gradient descent in the space of probability density functions. The main result is that, along the gradient flow induced by a distribution-dependent ordinary differential equation (ODE), the unknown data distribution emerges as the long-time limit; in other words, simulating the distribution-dependent ODE recovers the data distribution. Interestingly, simulating this ODE turns out to be equivalent to training generative adversarial networks (GANs). This equivalence provides a fresh perspective on GANs and, more importantly, sheds light on why they diverge: the GAN algorithm implicitly minimizes the mean squared error (MSE) between two sets of samples, and this MSE fitting alone can cause GANs to diverge. To solve the distribution-dependent ODE, we first establish that the associated nonlinear Fokker-Planck equation has a unique weak solution, using the Crandall-Liggett theorem for differential equations in Banach spaces. From this solution of the Fokker-Planck equation, we construct a unique solution to the ODE via Trevisan’s superposition principle. The convergence of the induced gradient flow to the data distribution is then obtained by analyzing the Fokker-Planck equation.
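
For reference, a distribution-dependent ODE of the kind described above, together with the nonlinear Fokker-Planck (continuity) equation satisfied by its time-marginal laws, can be written schematically as follows; here v is a generic placeholder velocity field, not the specific distribution-dependent field derived in the paper.

    % Schematic only: v is a placeholder distribution-dependent velocity field;
    % the paper derives a specific v, which is not reproduced here.
    \begin{align*}
      \frac{\mathrm{d}}{\mathrm{d}t} Y_t &= v(Y_t, \rho_t),
        \qquad \rho_t = \mathrm{Law}(Y_t), \\
      \partial_t \rho_t + \nabla \cdot \big( \rho_t \, v(\cdot, \rho_t) \big) &= 0
        \qquad \text{(nonlinear Fokker--Planck equation for the law } \rho_t \text{)}.
    \end{align*}

In this schematic, Trevisan’s superposition principle is the tool that passes from a weak solution of the second equation back to a solution of the first, and the convergence statement concerns the laws approaching the data distribution in the long-time limit.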
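
The two computational ingredients mentioned in the abstract, namely simulating the distribution-dependent ODE on an empirical (particle) measure and fitting a generator by minimizing a mean squared error between two sets of samples, can be illustrated with a small self-contained sketch. The kernel-based velocity field, step sizes, and linear generator below are hypothetical choices made purely for illustration; they are not the construction used in the paper.

    # Illustrative sketch only: the velocity field below is a hypothetical
    # kernel-based choice, not the field derived in the paper. It shows
    # (1) forward-Euler simulation of a distribution-dependent ODE on a
    # particle cloud, and (2) fitting a generator to the evolved particles by
    # minimizing a mean squared error between two sets of samples.
    import numpy as np

    rng = np.random.default_rng(0)


    def velocity(particles, data, bandwidth=0.5):
        """Hypothetical distribution-dependent velocity: each particle is pushed
        toward a kernel-weighted average of the data and away from a
        kernel-weighted average of the current particle cloud (a crude
        attraction/repulsion field)."""
        def kernel_mean(x, pts):
            w = np.exp(-np.sum((x - pts) ** 2, axis=1) / (2 * bandwidth ** 2))
            w = w / (w.sum() + 1e-12)
            return w @ pts

        drift = np.empty_like(particles)
        for i, x in enumerate(particles):
            drift[i] = kernel_mean(x, data) - kernel_mean(x, particles)
        return drift


    def simulate_ode(particles, data, step=0.1, n_steps=200):
        """Forward-Euler simulation of dY_t = v(Y_t, rho_t) dt, where rho_t is
        represented by the empirical measure of the particle cloud."""
        for _ in range(n_steps):
            particles = particles + step * velocity(particles, data)
        return particles


    def fit_generator_by_mse(latent, targets, lr=0.05, n_iters=500):
        """Least-squares fit of a linear generator G(z) = z @ W + b to targets
        paired by index, i.e. explicit minimization of the MSE between two
        sets of samples."""
        dim_z, dim_x = latent.shape[1], targets.shape[1]
        W = np.zeros((dim_z, dim_x))
        b = np.zeros(dim_x)
        for _ in range(n_iters):
            pred = latent @ W + b
            err = pred - targets
            W -= lr * latent.T @ err / len(latent)
            b -= lr * err.mean(axis=0)
        return W, b


    if __name__ == "__main__":
        data = rng.normal(loc=3.0, scale=0.7, size=(500, 2))  # "unknown" data distribution
        latent = rng.normal(size=(500, 2))                    # latent inputs to the generator
        particles = latent.copy()                             # initial particle cloud

        particles = simulate_ode(particles, data)             # evolve the empirical measure
        W, b = fit_generator_by_mse(latent, particles)        # MSE fit: generator vs. particles

        print("data mean:     ", data.mean(axis=0))
        print("generator mean:", (latent @ W + b).mean(axis=0))

In this toy setting, the least-squares fit of the linear generator to the evolved particles stands in for the implicit MSE fitting that the abstract attributes to GAN training.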
