Policy Gradient Methods Find the Nash Equilibrium in N-player General-sum Linear-quadratic Games

Authors: Ben Hambly, Renyuan Xu, Huining Yang; Journal of Machine Learning Research, 24(139):1−56, 2023.

Abstract

We consider the convergence of the natural policy gradient method to the Nash equilibrium of a general-sum N-player linear-quadratic game with stochastic dynamics over a finite horizon. We establish that the method achieves global convergence provided there is a sufficient amount of noise in the system; specifically, we give a condition, essentially a lower bound on the noise covariance in terms of the model parameters, that guarantees convergence. We support our findings with numerical experiments, demonstrating that the addition of noise leads to convergence even in settings where the policy gradient method may not converge for the deterministic dynamics.
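The sketch below illustrates the kind of iteration the abstract refers to: a model-based natural policy gradient update for each player's time-varying linear feedback policy in a small finite-horizon LQ game. It is not the authors' code; the dynamics, costs, horizon, and step size are illustrative assumptions, and exact gradients are used rather than the sampled ones a learning algorithm would estimate.

```python
# Minimal model-based sketch of simultaneous natural policy gradient updates
# in a two-player finite-horizon general-sum LQ game (illustrative parameters).
import numpy as np

T = 5                                   # horizon
A = np.array([[1.0, 0.1], [0.0, 1.0]])  # state dynamics
B = [np.array([[0.0], [1.0]]),          # player 1 input matrix B^1
     np.array([[0.5], [0.0]])]          # player 2 input matrix B^2
Q = [np.eye(2), 2.0 * np.eye(2)]        # running state costs Q^i
R = [np.eye(1), np.eye(1)]              # control costs R^i
Qf = [np.eye(2), np.eye(2)]             # terminal state costs
N = len(B)

# Time-varying linear feedback policies u^i_t = -K[i][t] @ x_t.
K = [[np.zeros((1, 2)) for _ in range(T)] for _ in range(N)]

def cost_to_go(K, i):
    """Backward recursion for player i's cost-to-go matrices P^i_t."""
    P = [None] * (T + 1)
    P[T] = Qf[i]
    for t in reversed(range(T)):
        A_cl = A - sum(B[j] @ K[j][t] for j in range(N))
        P[t] = Q[i] + K[i][t].T @ R[i] @ K[i][t] + A_cl.T @ P[t + 1] @ A_cl
    return P

eta = 0.02                              # step size (chosen conservatively)
for it in range(500):
    residual = 0.0
    new_K = [[None] * T for _ in range(N)]
    for i in range(N):
        P = cost_to_go(K, i)
        for t in range(T):
            # The plain policy gradient is 2 * E @ Sigma_t, with Sigma_t the state
            # covariance at time t.  Preconditioning by Sigma_t^{-1} (the natural
            # gradient) removes it, but Sigma_t must be positive definite -- which
            # is what a lower bound on the noise covariance guarantees -- for this
            # direction to be well defined.
            A_others = A - sum(B[j] @ K[j][t] for j in range(N) if j != i)
            E = (R[i] + B[i].T @ P[t + 1] @ B[i]) @ K[i][t] \
                - B[i].T @ P[t + 1] @ A_others
            new_K[i][t] = K[i][t] - 2.0 * eta * E
            residual = max(residual, np.linalg.norm(E))
    K = new_K

# At a Nash equilibrium every player's natural gradient vanishes, so the
# residual should be close to zero after enough iterations.
print(f"max natural-gradient residual after {it + 1} iterations: {residual:.2e}")
```

The printed residual is the largest norm of any player's (preconditioned) gradient term; it serving as a convergence check reflects the fact that, under the noise condition, a vanishing natural gradient for every player characterizes the Nash equilibrium.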
