Dropout Training is Distributionally Robust Optimal
José Blanchet, Yang Kang, José Luis Montiel Olea, Viet Anh Nguyen, Xuhui Zhang; 24(180):1−60, 2023.
Abstract
This paper shows that dropout training in generalized linear models is the minimax solution of a two-player, zero-sum game in which an adversarial nature corrupts a statistician's covariates using a multiplicative nonparametric errors-in-variables model. In this game, nature's least favorable distribution is dropout noise, under which entries of the covariate vector are deleted independently with a fixed probability δ. Consequently, dropout training provides out-of-sample expected loss guarantees for distributions that arise from multiplicative perturbations of in-sample data. The paper also gives a concrete recommendation for selecting the tuning parameter δ, and presents a novel, parallelizable, unbiased multi-level Monte Carlo algorithm that speeds up the implementation of dropout training. Compared with the naive implementation of dropout, this algorithm has substantially lower computational cost, particularly when the number of data points is much smaller than the dimension of the covariate vector.
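To make the setting concrete, below is a minimal sketch of the naive implementation of dropout training that the abstract contrasts with the multi-level Monte Carlo accelerator, instantiated for logistic regression (a generalized linear model): each gradient step draws a fresh multiplicative mask that deletes every covariate entry independently with probability δ. The function name, the toy data, and the standard 1/(1−δ) "inverted dropout" rescaling are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def dropout_sgd_logistic(X, y, delta=0.5, lr=0.1, n_iters=2000, seed=0):
    """Naive dropout training for logistic regression with labels in {-1, +1}.

    Each iteration resamples a multiplicative Bernoulli mask that deletes
    every covariate entry independently with probability delta, then takes
    a gradient step on the logistic loss of the corrupted covariates.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(n_iters):
        mask = rng.binomial(1, 1.0 - delta, size=(n, d))
        Xm = X * mask / (1.0 - delta)  # multiplicative dropout noise (rescaling is an assumption)
        z = Xm @ beta
        # gradient of the mean logistic loss log(1 + exp(-y z)) w.r.t. beta;
        # clipping avoids overflow in exp for large margins
        grad = Xm.T @ (-y / (1.0 + np.exp(np.clip(y * z, -30.0, 30.0)))) / n
        beta -= lr * grad
    return beta

# Toy data in the regime the abstract highlights: n much smaller than d.
rng = np.random.default_rng(1)
X = rng.standard_normal((40, 500))
y = np.sign(X[:, 0] + 0.3 * rng.standard_normal(40))
beta_hat = dropout_sgd_logistic(X, y, delta=0.5)
```

Because a fresh mask is drawn for all n×d covariate entries at every iteration, the per-step cost of this naive scheme grows with d even when n is small, which is the regime where the paper's unbiased multi-level Monte Carlo algorithm is reported to help most.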