Pure Exploration and Regret Minimization in Matching Bandits

07/31/2021
by Flore Sentenac et al.

Finding an optimal matching in a weighted graph is a standard combinatorial problem. We consider its semi-bandit version, in which either a single pair or a full matching is sampled sequentially. We prove that a rank-1 assumption on the adjacency matrix can be leveraged to reduce both the sample complexity and the regret of off-the-shelf algorithms, down to a linear dependency on the number of vertices (up to polylog terms).
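To illustrate the rank-1 structure the abstract refers to: if the edge weight between left vertex i and right vertex j factorizes as W[i, j] = u[i] * v[j] with nonnegative factors, then by the rearrangement inequality the optimal matching simply pairs vertices sorted by u with vertices sorted by v. The sketch below is not the paper's algorithm; it is a minimal check of this structural fact (names u, v, W are illustrative) against a brute-force search over all matchings:

```python
from itertools import permutations

import numpy as np

rng = np.random.default_rng(0)
n = 5
u = rng.uniform(0.1, 1.0, n)  # left-vertex factors
v = rng.uniform(0.1, 1.0, n)  # right-vertex factors
W = np.outer(u, v)            # rank-1 weight matrix: W[i, j] = u[i] * v[j]

# Pair the k-th smallest u with the k-th smallest v (rearrangement inequality).
perm = np.empty(n, dtype=int)
perm[np.argsort(u)] = np.argsort(v)
greedy_value = W[np.arange(n), perm].sum()

# Brute-force over all n! matchings to confirm the sorted pairing is optimal.
best = max(W[np.arange(n), list(p)].sum() for p in permutations(range(n)))
assert np.isclose(greedy_value, best)
```

This is why the rank-1 assumption helps: the learner only needs to estimate 2n factors (and, in fact, only the order of each factor vector) rather than all n^2 edge weights.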


Related research

05/28/2019 - Combinatorial Bandits with Full-Bandit Feedback: Sample Complexity and Regret Minimization
  Combinatorial Bandits generalize multi-armed bandits, where k out of n a...

11/01/2020 - Experimental Design for Regret Minimization in Linear Bandits
  In this paper we propose a novel experimental design-based algorithm to ...

08/02/2022 - Unimodal Mono-Partite Matching in a Bandit Setting
  We tackle a new emerging problem, which is finding an optimal monopartit...

08/02/2022 - UniRank: Unimodal Bandit Algorithm for Online Ranking
  We tackle a new emerging problem, which is finding an optimal monopartit...

02/17/2020 - Statistically Efficient, Polynomial Time Algorithms for Combinatorial Semi Bandits
  We consider combinatorial semi-bandits over a set of arms X⊂{0,1}^d wher...

01/31/2023 - Combinatorial Causal Bandits without Graph Skeleton
  In combinatorial causal bandits (CCB), the learning agent chooses a subs...

05/17/2019 - Pair Matching: When bandits meet stochastic block model
  The pair-matching problem appears in many applications where one wants t...
