Finding the Bandit in a Graph: Sequential Search-and-Stop

06/06/2018
by Pierre Perrault, et al.

We consider the problem where an agent wants to find a hidden object that is randomly located in some vertex of a directed acyclic graph (DAG) according to a fixed but possibly unknown distribution. The agent can only examine vertices whose in-neighbors have already been examined. In scheduling theory, this problem is denoted by 1|prec|∑ w_jC_j. In this paper, however, we address a learning setting in which the agent may stop before having found the object and restart the search on a new independent instance of the same problem. The goal is to maximize the total number of hidden objects found under a time constraint. The agent can thus skip an instance after realizing that it would spend too much time on it. Our contributions are to both search theory and multi-armed bandits. If the distribution is known, we provide a quasi-optimal greedy strategy with the help of known computationally efficient algorithms for solving 1|prec|∑ w_jC_j under some assumption on the DAG. If the distribution is unknown, we show how to sequentially learn it and, at the same time, act near-optimally in order to collect as many hidden objects as possible. We provide an algorithm, prove theoretical guarantees, and empirically show that it outperforms the naïve baseline.
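To make the search-and-stop trade-off concrete, here is a minimal Python sketch. It is not the paper's algorithm. It assumes a known distribution, a fixed examination order, and a per-vertex examination cost, and it scores "search the first k vertices, then restart" by the expected number of objects found per unit of time. The names `is_valid_order`, `prefix_rate`, `preds`, `probs`, and `costs` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, List, Sequence


def is_valid_order(order: Sequence[str], preds: Dict[str, List[str]]) -> bool:
    """Check that every vertex appears after all of its in-neighbors."""
    seen = set()
    for v in order:
        if any(p not in seen for p in preds.get(v, [])):
            return False
        seen.add(v)
    return True


def prefix_rate(order: Sequence[str], probs: Dict[str, float],
                costs: Dict[str, float], k: int) -> float:
    """Expected objects found per unit time when only the first k vertices
    of `order` are examined before restarting on a fresh instance.

    Assumption (ours): the search ends as soon as the object is found, so
    finding it at a vertex costs the time elapsed up to that vertex, and
    missing it costs the whole prefix.
    """
    elapsed = 0.0        # time spent once the current vertex is examined
    found = 0.0          # probability the object lies in the prefix so far
    expected_time = 0.0  # expected time spent on one instance
    for v in order[:k]:
        elapsed += costs[v]
        expected_time += probs[v] * elapsed
        found += probs[v]
    expected_time += (1.0 - found) * elapsed  # object was not in the prefix
    return found / expected_time if expected_time > 0 else 0.0


# Toy instance (made up): a must be examined before b and before c.
preds = {"a": [], "b": ["a"], "c": ["a"]}
probs = {"a": 0.5, "b": 0.4, "c": 0.1}
costs = {"a": 1.0, "b": 1.0, "c": 4.0}
order = ["a", "b", "c"]

rates = {k: prefix_rate(order, probs, costs, k) for k in (1, 2, 3)}
best_k = max(rates, key=rates.get)
```

In this toy instance, stopping after two vertices gives the highest rate (0.6, against 0.5 for one vertex and about 0.53 for all three): the expensive, unlikely vertex c is not worth examining, which is the kind of skipping the abstract describes.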

research
07/07/2019

Individual Regret in Cooperative Nonstochastic Multi-Armed Bandits

We study agents communicating over an underlying network by exchanging m...
research
12/29/2022

Graph Searching with Predictions

Consider an agent exploring an unknown graph in search of some goal stat...
research
01/04/2021

Be Greedy in Multi-Armed Bandits

The Greedy algorithm is the simplest heuristic in sequential decision pr...
research
04/12/2021

Censored Semi-Bandits for Resource Allocation

We consider the problem of sequentially allocating resources in a censor...
research
11/16/2020

Distributed Bandits: Probabilistic Communication on d-regular Graphs

We study the decentralized multi-agent multi-armed bandit problem for ag...
research
11/20/2019

Exact and approximation algorithms for the expanding search problem

Suppose a target is hidden in one of the vertices of an edge-weighted gr...
research
05/05/2020

On Reachable Assignments in Cycles and Cliques

The efficient and fair distribution of indivisible resources among agent...
