Near-Optimal Collaborative Learning in Bandits

05/31/2022
by   Clémence Réda, et al.

This paper introduces a general multi-agent bandit model in which each agent faces a finite set of arms and may communicate with other agents through a central controller in order to identify its optimal arm (in pure exploration) or play it (in regret minimization). The twist is that the optimal arm for each agent is the arm with the largest expected mixed reward, where the mixed reward of an arm is a weighted sum of that arm's rewards across all agents. This makes communication between agents often necessary. This general setting allows us to recover and extend several recent models for collaborative bandit learning, including the recently proposed federated learning with personalization (Shi et al., 2021). In this paper, we provide new lower bounds on the sample complexity of pure exploration and on the regret. We then propose a near-optimal algorithm for pure exploration. This algorithm is based on phased elimination with two novel ingredients; one is a data-dependent sampling scheme within each phase, aimed at matching a relaxation of the lower bound.
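The mixed-reward objective described in the abstract can be illustrated with a minimal sketch. Here `mu[m, k]` is agent m's expected reward for arm k and `W[m, n]` is the weight agent m places on agent n's rewards; both arrays, the row-stochastic weight choice, and all names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical instance of the mixed-reward model: M agents, K arms.
rng = np.random.default_rng(0)
M, K = 3, 5
mu = rng.uniform(size=(M, K))          # per-agent expected rewards (assumed known here)
W = rng.dirichlet(np.ones(M), size=M)  # personalization weights, one row per agent

# mixed[m, k] = sum_n W[m, n] * mu[n, k]: the weighted sum over agents' rewards.
mixed = W @ mu
optimal_arms = mixed.argmax(axis=1)    # each agent's optimal arm under its mixed reward
```

Note that an agent's optimal arm under `mixed` generally differs from the arm maximizing its own `mu[m]`, which is why communication between agents is often necessary.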


research
01/22/2023

Doubly Adversarial Federated Bandits

We study a new non-stochastic federated multi-armed bandit problem with ...
research
10/16/2017

Fully adaptive algorithm for pure exploration in linear bandits

We propose the first fully-adaptive algorithm for pure exploration in li...
research
10/29/2021

Collaborative Pure Exploration in Kernel Bandit

In this paper, we formulate a Collaborative Pure Exploration in Kernel B...
research
05/28/2019

Combinatorial Bandits with Full-Bandit Feedback: Sample Complexity and Regret Minimization

Combinatorial Bandits generalize multi-armed bandits, where k out of n a...
research
06/07/2023

Optimal Fair Multi-Agent Bandits

In this paper, we study the problem of fair multi-agent multi-arm bandit...
research
03/09/2023

Communication-Efficient Collaborative Heterogeneous Bandits in Networks

The multi-agent multi-armed bandit problem has been studied extensively ...
research
05/08/2021

Pure Exploration Bandit Problem with General Reward Functions Depending on Full Distributions

In this paper, we study the pure exploration bandit model on general dis...
