Pair Matching: When bandits meet stochastic block model

by Christophe Giraud, et al.

The pair-matching problem appears in many applications where one wants to discover good matches between pairs of individuals. Formally, the set of individuals is represented by the nodes of a graph where the edges, unobserved at first, represent the good matches. The algorithm queries pairs of nodes and observes the presence/absence of edges. Its goal is to discover as many edges as possible with a fixed budget of queries. Pair-matching is a particular instance of the multi-armed bandit problem in which the arms are pairs of individuals and the rewards are edges linking these pairs. This bandit problem is non-standard though, as each arm can only be played once. Given this last constraint, sublinear regret can be expected only if the graph presents some underlying structure. This paper shows that sublinear regret is achievable in the case where the graph is generated according to a Stochastic Block Model (SBM) with two communities. Optimal regret bounds are computed for this pair-matching problem. They exhibit a phase transition related to the Kesten-Stigum threshold for community detection in SBM. To avoid undesirable features of optimal solutions, the pair-matching problem is also considered in the case where each node is constrained to be sampled fewer than a given number of times. We show how this constraint deteriorates optimal regret rates. The paper concludes with a conjecture regarding the optimal regret when the number of communities is larger than 2. Contrary to the two-community case, we believe that a statistical-computational gap would appear in this problem.
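
To make the setup concrete, here is a minimal simulation sketch of the query model described in the abstract: nodes belong to one of two communities, each pair can be queried at most once, and a query reveals whether the pair is an edge. It is not the paper's algorithm; it only implements a uniformly random baseline. The function names, the parameters `p` (within-community edge probability) and `q` (across-community edge probability), and the uniform community assignment are illustrative assumptions that do not come from the abstract.

```python
import itertools
import random


def make_communities(n, rng):
    """Assign each of n nodes to one of two communities uniformly at random (assumption)."""
    return [rng.randrange(2) for _ in range(n)]


def query_edge(labels, i, j, p, q, rng):
    """Reveal whether pair (i, j) is an edge.

    Assumed two-community SBM: probability p inside a community, q across.
    Because each pair is queried at most once, drawing the edge lazily at
    query time is equivalent to drawing the whole graph up front.
    """
    prob = p if labels[i] == labels[j] else q
    return rng.random() < prob


def random_pair_matching(n, budget, p, q, seed=0):
    """Baseline that spends the budget on uniformly random distinct pairs.

    Returns (number of edges discovered, number of queries actually made).
    """
    rng = random.Random(seed)
    labels = make_communities(n, rng)
    pairs = list(itertools.combinations(range(n), 2))
    budget = min(budget, len(pairs))  # cannot query more pairs than exist
    # Sampling without replacement enforces "each arm is played at most once".
    queried = rng.sample(pairs, budget)
    discovered = sum(query_edge(labels, i, j, p, q, rng) for i, j in queried)
    return discovered, budget


if __name__ == "__main__":
    found, used = random_pair_matching(n=200, budget=1000, p=0.3, q=0.05)
    print(f"random baseline: {found} edges discovered in {used} queries")
```

One natural benchmark, assumed here rather than taken from the abstract, is a strategy that knows the communities and, when `p > q`, queries only within-community pairs, collecting about `p * budget` edges in expectation. The random baseline ignores the community structure entirely, so its shortfall against that benchmark grows linearly with the budget, which is why exploiting the SBM structure matters for sublinear regret.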


Stochastic Multi-armed Bandits in Constant Space

We consider the stochastic bandit problem in the sublinear space setting...

Multi-Armed Bandits on Unit Interval Graphs

An online learning problem with side information on the similarity and d...

Memory-Constrained No-Regret Learning in Adversarial Bandits

An adversarial bandit problem with memory constraints is studied where o...

Asymptotic Optimality for Decentralised Bandits

We consider a large number of agents collaborating on a multi-armed band...

The Survival Bandit Problem

We study the survival bandit problem, a variant of the multi-armed bandi...

Pure Exploration and Regret Minimization in Matching Bandits

Finding an optimal matching in a weighted graph is a standard combinator...

The Influence of Shape Constraints on the Thresholding Bandit Problem

We investigate the stochastic Thresholding Bandit problem (TBP) under se...
