On-Demand Sampling: Learning Optimally from Multiple Distributions

by Nika Haghtalab, et al.
UC Berkeley

Social and real-world considerations such as robustness, fairness, social welfare, and multi-agent tradeoffs have given rise to multi-distribution learning paradigms, such as collaborative learning, group distributionally robust optimization, and fair federated learning. In each of these settings, a learner seeks to minimize its worst-case loss over a set of n predefined distributions, while using as few samples as possible. In this paper, we establish the optimal sample complexity of these learning paradigms and give algorithms that meet this sample complexity. Importantly, our sample complexity bounds exceed the sample complexity of learning a single distribution only by an additive factor of n log(n) / ε². These improve upon the best known sample complexity of agnostic federated learning by Mohri et al. by a multiplicative factor of n, improve upon the sample complexity of collaborative learning by Nguyen and Zakynthinou by a multiplicative factor of log(n) / ε³, and give the first sample complexity bounds for the group DRO objective of Sagawa et al. To achieve optimal sample complexity, our algorithms learn to sample from distributions on demand. Our algorithm design and analysis are enabled by our extensions of stochastic optimization techniques for solving stochastic zero-sum games. In particular, we contribute variants of Stochastic Mirror Descent that can trade off between players' access to cheap one-off samples or more expensive reusable ones.
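The min-max structure described above can be illustrated with a minimal sketch. This is not the paper's algorithm; it is a toy two-player loop on a made-up scalar problem (all names and constants below are hypothetical): a learner runs stochastic gradient descent on a single parameter, while an adversary runs multiplicative-weights ascent over n distributions, and each round draws exactly one fresh, one-off sample from a distribution chosen on demand according to the adversary's (exploration-mixed) weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: n Gaussian distributions, each with its own mean.
# The learner fits one scalar theta to minimize the worst expected squared error.
n = 4
means = np.array([-1.0, 0.0, 0.5, 2.0])

def sample(i):
    """Draw one on-demand (one-off, non-reusable) sample from distribution i."""
    return means[i] + rng.normal()

def loss_and_grad(theta, x):
    # Squared loss (theta - x)^2 and its gradient w.r.t. theta.
    return (theta - x) ** 2, 2.0 * (theta - x)

theta = 0.0
w = np.full(n, 1.0 / n)          # adversary's weights over the n distributions
eta_theta, eta_w, gamma = 0.05, 0.01, 0.1
T = 2000
avg_theta = 0.0

for t in range(T):
    # Mix in uniform exploration so importance weights stay bounded.
    p = (1.0 - gamma) * w + gamma / n
    i = rng.choice(n, p=p)        # pick a distribution on demand
    x = sample(i)                 # one fresh sample; never reused
    loss, g = loss_and_grad(theta, x)
    theta -= eta_theta * g        # learner: stochastic gradient descent step
    # Adversary: importance-weighted multiplicative-weights ascent step.
    w = w * np.exp(eta_w * loss * (np.arange(n) == i) / (n * p[i]))
    w = w / w.sum()
    avg_theta += (theta - avg_theta) / (t + 1)   # averaged iterate
```

The averaged iterate `avg_theta` drifts toward the point where the worst-case distributions' losses equalize, which for the two extreme means above is near 0.5; the key on-demand feature is that each round costs exactly one sample, drawn from whichever distribution the current weights deem hardest.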


Code Repositories


Official implementation of On-Demand Sampling: Learning Optimally from Multiple Distributions (NeurIPS 2022)
