Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits

05/30/2023
by Ronshee Chawla, et al.

The study of collaborative multi-agent bandits has attracted significant attention recently. In light of this, we initiate the study of a new collaborative setting consisting of N agents, each learning one of M stochastic multi-armed bandits, with the goal of minimizing the group cumulative regret. We develop decentralized algorithms that facilitate collaboration between the agents under two scenarios. We characterize the performance of these algorithms by deriving upper bounds on the per-agent cumulative regret and on the group regret. We also prove lower bounds on the group regret in this setting, which demonstrate the near-optimal behavior of the proposed algorithms.
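To make the setting concrete, the following is a minimal sketch of N agents, each assigned one of M Bernoulli bandits, with the group regret taken as the sum of the per-agent cumulative pseudo-regrets. It is not the decentralized collaborative algorithm from the paper: each agent runs standard UCB1 on its own with no communication, which serves only as a non-collaborative baseline. The function names, the Bernoulli reward model, and the fixed agent-to-bandit assignment are all assumptions made for illustration.

```python
import math
import random


def ucb1_regret(arm_means, horizon, rng):
    """Run standard UCB1 on one Bernoulli bandit.

    Returns the cumulative pseudo-regret, i.e. the sum over rounds of
    (best arm mean - mean of the arm actually pulled).
    """
    k = len(arm_means)
    counts = [0] * k      # number of pulls per arm
    sums = [0.0] * k      # total observed reward per arm
    best = max(arm_means)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1   # initialization: pull every arm once
        else:
            # Pick the arm with the largest upper confidence bound.
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - arm_means[arm]
    return regret


def simulate_group_regret(bandits, assignment, horizon, seed=0):
    """Simulate agents that learn independently (hypothetical baseline).

    bandits:    list of M lists of arm means (assumed Bernoulli rewards)
    assignment: list of length N; assignment[i] is the bandit index
                that agent i is learning (assumed known and fixed here)
    Returns (per-agent regrets, group regret = their sum).
    """
    rng = random.Random(seed)
    per_agent = [ucb1_regret(bandits[m], horizon, rng) for m in assignment]
    return per_agent, sum(per_agent)


if __name__ == "__main__":
    # M = 2 bandits, N = 4 agents; values are made up for illustration.
    bandits = [[0.9, 0.5], [0.2, 0.8, 0.4]]
    assignment = [0, 1, 1, 0]
    per_agent, group = simulate_group_regret(bandits, assignment, horizon=1000)
    print("per-agent regret:", per_agent)
    print("group regret:", group)
```

In this baseline, agents that share a bandit (agents 0 and 3, or agents 1 and 2 in the usage example) each pay the full exploration cost separately. A collaborative scheme of the kind the abstract describes would aim to reduce the group regret by letting such agents share what they learn.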
