Towards Instance Optimal Bounds for Best Arm Identification

08/22/2016
by   Lijie Chen, et al.
0

In the classical best arm identification (Best-1-Arm) problem, we are given n stochastic bandit arms, each associated with a reward distribution with an unknown mean. We would like to identify the arm with the largest mean with probability at least 1-δ, using as few samples as possible. Understanding the sample complexity of Best-1-Arm has attracted significant attention since the last decade. However, the exact sample complexity of the problem is still unknown. Recently, Chen and Li made the gap-entropy conjecture concerning the instance sample complexity of Best-1-Arm. Given an instance I, let μ_[i] be the ith largest mean and Δ_[i]=μ_[1]-μ_[i] be the corresponding gap. H(I)=∑_i=2^nΔ_[i]^-2 is the complexity of the instance. The gap-entropy conjecture states that Ω(H(I)·(δ^-1+Ent(I))) is an instance lower bound, where Ent(I) is an entropy-like term determined by the gaps, and there is a δ-correct algorithm for Best-1-Arm with sample complexity O(H(I)·(δ^-1+Ent(I))+Δ_[2]^-2Δ_[2]^-1). If the conjecture is true, we would have a complete understanding of the instance-wise sample complexity of Best-1-Arm. We make significant progress towards the resolution of the gap-entropy conjecture. For the upper bound, we provide a highly nontrivial algorithm which requires O(H(I)·(δ^-1 +Ent(I))+Δ_[2]^-2Δ_[2]^-1polylog(n,δ^-1)) samples in expectation. For the lower bound, we show that for any Gaussian Best-1-Arm instance with gaps of the form 2^-k, any δ-correct monotone algorithm requires Ω(H(I)·(δ^-1 + Ent(I))) samples in expectation.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
02/13/2017

Nearly Instance Optimal Sample Complexity Bounds for Top-k Arm Selection

In the Best-k-Arm problem, we are given n stochastic bandit arms, each a...
research
04/11/2022

Approximate Top-m Arm Identification with Heterogeneous Reward Variances

We study the effect of reward variance heterogeneity in the approximate ...
research
04/08/2023

Best Arm Identification with Fairness Constraints on Subpopulations

We formulate, analyze and solve the problem of best arm identification w...
research
08/24/2019

Optimal best arm selection for general distributions

Given a finite set of unknown distributions or arms that can be sampled ...
research
01/25/2022

Almost Optimal Variance-Constrained Best Arm Identification

We design and analyze VA-LUCB, a parameter-free algorithm, for identifyi...
research
11/13/2017

Thresholding Bandit for Dose-ranging: The Impact of Monotonicity

We analyze the sample complexity of the thresholding bandit problem, wit...
research
10/03/2022

Dealing with Unknown Variances in Best-Arm Identification

The problem of identifying the best arm among a collection of items havi...

Please sign up or login with your details

Forgot password? Click here to reset