The Sample Complexity of Multi-Distribution Learning for VC Classes
Multi-distribution learning is a natural generalization of PAC learning to settings with multiple data distributions. There remains a significant gap between the known upper and lower bounds for PAC-learnable classes. In particular, though we understand the sample complexity of learning a VC dimension d class on k distributions to be O(ϵ^-2ln(k)(d + k) + min{ϵ^-1 dk, ϵ^-4ln(k) d}), the best lower bound is Ω(ϵ^-2(d + k ln(k))). We discuss recent progress on this problem and some hurdles that are fundamental to the use of game dynamics in statistical learning.
READ FULL TEXT