Ill-Posedness of Maximum Likelihood Estimation in Gaussian Process Regression
Toni Karvonen, Chris J. Oates; 24(120):1–47, 2023.
Abstract
Gaussian process regression is widely used in machine learning and statistics, and maximum likelihood estimation is commonly employed to determine the parameters of the covariance kernel. However, the question of when maximum likelihood estimation is well-posed, meaning that small perturbations of the data do not significantly affect the predictions of the regression model, remains an open problem. This article presents an analysis of scenarios in which the maximum likelihood estimator fails to be well-posed. Specifically, it is shown that the predictive distributions are not Lipschitz in the data with respect to the Hellinger distance when the lengthscale parameter of a Gaussian process with a stationary covariance function is estimated using maximum likelihood. While the failure of maximum likelihood estimation in Gaussian process regression is well known, this article provides rigorous theoretical results on the topic. These negative findings suggest that well-posedness may need to be assessed on a case-by-case basis when using maximum likelihood estimation to train a Gaussian process model.
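The quantities named in the abstract can be made concrete with a minimal numerical sketch. It is not the authors' code and does not reproduce their results. It assumes noiseless one-dimensional data, a zero-mean Gaussian process with a unit-magnitude Gaussian kernel, a coarse grid search in place of a continuous optimiser, and a small diagonal jitter for numerical stability. All function names (`gauss_kernel`, `neg_log_likelihood`, `ml_lengthscale`, `predict`, `hellinger_gauss`) are hypothetical and introduced only for illustration.

```python
import numpy as np

def gauss_kernel(x, y, ell):
    """Stationary Gaussian kernel exp(-(x - y)^2 / (2 ell^2)) (an assumed choice)."""
    d = x[:, None] - y[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def neg_log_likelihood(ell, x, y, jitter=1e-8):
    """Negative log marginal likelihood of a zero-mean GP, up to an additive constant."""
    K = gauss_kernel(x, x, ell) + jitter * np.eye(len(x))
    L = np.linalg.cholesky(K)           # K = L L^T
    alpha = np.linalg.solve(L, y)       # alpha @ alpha = y^T K^{-1} y
    return 0.5 * alpha @ alpha + np.sum(np.log(np.diag(L)))

def ml_lengthscale(x, y, grid):
    """Grid-search maximum likelihood estimate of the lengthscale."""
    values = [neg_log_likelihood(ell, x, y) for ell in grid]
    return grid[int(np.argmin(values))]

def predict(x, y, x_new, ell, jitter=1e-8):
    """Predictive mean and variance of the GP at a single point x_new."""
    K = gauss_kernel(x, x, ell) + jitter * np.eye(len(x))
    k = gauss_kernel(x, np.array([x_new]), ell)[:, 0]
    w = np.linalg.solve(K, k)
    mean = w @ y
    var = max(1.0 - w @ k, 0.0)         # prior variance is 1 for this kernel
    return mean, var

def hellinger_gauss(m1, v1, m2, v2):
    """Hellinger distance between N(m1, v1) and N(m2, v2); v1, v2 are variances."""
    s = v1 + v2
    bc = np.sqrt(2.0 * np.sqrt(v1 * v2) / s) * np.exp(-0.25 * (m1 - m2) ** 2 / s)
    return np.sqrt(max(1.0 - bc, 0.0))

if __name__ == "__main__":
    # Illustrative data and perturbation (both are arbitrary assumptions).
    x = np.linspace(0.0, 1.0, 5)
    y = np.sin(2.0 * np.pi * x)
    y_pert = y + 1e-3 * np.cos(7.0 * x)
    grid = np.linspace(0.05, 0.8, 76)

    ell, ell_pert = ml_lengthscale(x, y, grid), ml_lengthscale(x, y_pert, grid)
    m1, v1 = predict(x, y, 0.6, ell)
    m2, v2 = predict(x, y_pert, 0.6, ell_pert)
    print("lengthscales:", ell, ell_pert)
    print("Hellinger / data perturbation:",
          hellinger_gauss(m1, v1, m2, v2) / np.linalg.norm(y - y_pert))
```

The last printed ratio is the kind of quantity a Lipschitz bound would control: the Hellinger distance between the two predictive distributions divided by the size of the data perturbation. A single run on one data set says nothing about whether such a bound holds in general, which is the question the article treats rigorously.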