Controlling Wasserstein Distances by Kernel Norms in Compressive Statistical Learning

Comparing probability distributions is a central task in machine learning. Two popular families of metrics for this purpose are Maximum Mean Discrepancies (MMD), which are kernel norms, and Wasserstein distances, which stem from optimal transport. This paper establishes conditions under which Wasserstein distances can be controlled, i.e., upper-bounded, by MMD norms.
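For reference, and with generic notation that is not taken verbatim from the paper, the two quantities compare a pair of probability distributions $\pi, \pi'$ as follows: the MMD associated with a positive semi-definite kernel $\kappa$ with RKHS $\mathcal{H}$ is the kernel norm of the difference of mean embeddings, while the Wasserstein distance is an optimal transport cost over the couplings $\Pi(\pi,\pi')$:

\[
\mathrm{MMD}_\kappa(\pi,\pi') = \|\mu_\pi - \mu_{\pi'}\|_{\mathcal{H}} = \sup_{\|f\|_{\mathcal{H}} \le 1} \Big| \mathbb{E}_{X\sim\pi} f(X) - \mathbb{E}_{Y\sim\pi'} f(Y) \Big|,
\qquad
W_p(\pi,\pi') = \left( \inf_{\gamma \in \Pi(\pi,\pi')} \int d(x,y)^p \, \mathrm{d}\gamma(x,y) \right)^{1/p}.
\]

Controlling the Wasserstein distance by the MMD then amounts to establishing bounds of Hölder type, $W_p(\pi,\pi') \le C \, \mathrm{MMD}_\kappa(\pi,\pi')^{\gamma}$ for some constant $C$ and exponent $\gamma \in (0,1]$; the precise constants, exponents, and assumptions on $\kappa$ and on the distributions are the object of the conditions studied in the paper, and the form shown here is only indicative.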

The motivation for this work comes from compressive statistical learning (CSL) theory, a framework for resource-efficient large-scale learning in which the training data is summarized in a single vector called a sketch, which captures the information relevant to the learning task. Drawing inspiration from existing CSL results, we introduce the Hölder Lower Restricted Isometric Property (Hölder LRIP) and show that it yields guarantees for compressive statistical learning.
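As a rough schematic, with illustrative notation rather than the paper's own, the sketch of a training set $\{x_1,\dots,x_n\}$ is the empirical average of a feature map $\Phi$, which approximates a linear measurement of the underlying distribution $\pi$; a Hölder-type LRIP then asks that a task-specific metric between distributions in the model set be controlled by (a power of) the distance between their sketches:

\[
s = \frac{1}{n}\sum_{i=1}^{n} \Phi(x_i) \;\approx\; \mathcal{A}(\pi) := \mathbb{E}_{X\sim\pi}\,\Phi(X),
\qquad
\|\pi - \pi'\|_{\text{task}} \;\le\; C \, \|\mathcal{A}(\pi) - \mathcal{A}(\pi')\|_2^{\gamma}
\quad \text{for all } \pi,\pi' \text{ in the model set},
\]

where $\gamma = 1$ corresponds to the standard LRIP of compressive statistical learning and $\gamma < 1$ to its Hölder relaxation; constants and exponents here are placeholders.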

By exploring the relationship between MMD and Wasserstein distances, we obtain such guarantees through the notion of Wasserstein regularity of the learning task, i.e., the property that a task-specific metric between probability distributions is bounded by a Wasserstein distance.
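Schematically, and with the same illustrative notation as above, Wasserstein regularity of the task and a kernel-norm control of the Wasserstein distance chain together by composing the two Hölder bounds:

\[
\|\pi - \pi'\|_{\text{task}} \le C_1\, W_p(\pi,\pi')^{\gamma_1}
\quad\text{and}\quad
W_p(\pi,\pi') \le C_2\, \mathrm{MMD}_\kappa(\pi,\pi')^{\gamma_2}
\;\Longrightarrow\;
\|\pi - \pi'\|_{\text{task}} \le C_1 C_2^{\gamma_1}\, \mathrm{MMD}_\kappa(\pi,\pi')^{\gamma_1\gamma_2},
\]

so that when the sketch features approximate the kernel mean embedding, the task metric is controlled by a power of the distance between sketches, which is the shape of a Hölder LRIP. The constants and exponents above are placeholders for the precise conditions derived in the paper.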