VCG Mechanism Design with Unknown Agent Values under Stochastic Bandit Feedback
Kirthevasan Kandasamy, Joseph E Gonzalez, Michael I Jordan, Ion Stoica; 24(53):1−45, 2023.
Abstract
This study focuses on a multi-round mechanism design problem that aims to maximize welfare in situations where agents are unaware of their own values. In each round, a mechanism assigns an allocation to a group of agents and charges them a price. At the end of each round, the agents provide stochastic feedback to the mechanism regarding the allocation they received. This scenario is particularly relevant in cloud markets and online advertising, where an agent may only determine their value for an allocation after experiencing it. Consequently, the mechanism needs to explore different allocations for each agent to learn their values while simultaneously identifying the socially optimal set of allocations. The objective is to develop truthful and individually rational mechanisms that replicate the classical VCG mechanism in the long run. To achieve this, three notions of regret are defined for welfare, individual utilities of each agent, and the mechanism itself. It is demonstrated that these three terms are interdependent, with an Omega(T^(2/3)) lower bound established for the maximum value of these three terms after T rounds of allocations. Additionally, an algorithm is described that essentially achieves this rate. The framework also provides flexibility to control the pricing scheme, allowing for a trade-off between agent and seller regrets. Asymptotic variants for truthfulness and individual rationality are defined, and asymptotic rates are provided to quantify the degree to which both properties are satisfied by the proposed algorithm.
[abs]