Regret Analysis of the Anytime Optimally Confident UCB Algorithm

03/29/2016
by Tor Lattimore, et al.

I introduce and analyse an anytime version of the Optimally Confident UCB (OCUCB) algorithm, designed to minimise cumulative regret in finite-armed stochastic bandits with subgaussian noise. The new algorithm is simple, intuitive (in hindsight), and comes with the strongest finite-time regret guarantees known for a horizon-free algorithm. I also prove a finite-time lower bound that nearly matches the upper bound.
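The abstract's key structural point is that an *anytime* algorithm chooses its confidence widths using only the current round, never the horizon. The paper's exact OCUCB index is not reproduced in the abstract, so the sketch below uses the classic UCB1-style index sqrt(2 log t / T_i(t)) purely to illustrate the anytime framework; the function name and arm model are illustrative assumptions, not the paper's construction.

```python
import math
import random

def anytime_ucb(means, horizon, seed=0):
    """Generic anytime UCB index policy on Gaussian (1-subgaussian) arms.

    Illustrative only: the index sqrt(2 log t / T_i(t)) is the classic
    UCB1 width, not the OCUCB index from the paper. `horizon` bounds the
    simulation length but is never used inside the index, which is what
    makes the policy anytime / horizon-free.
    Returns per-arm pull counts after `horizon` rounds.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    # Pull each arm once to initialise the empirical means.
    for i in range(k):
        counts[i] = 1
        sums[i] = rng.gauss(means[i], 1.0)
    for t in range(k + 1, horizon + 1):
        # The index depends only on the current round t and the pull
        # counts so far -- no knowledge of the horizon is needed.
        ucb = [sums[i] / counts[i]
               + math.sqrt(2.0 * math.log(t) / counts[i])
               for i in range(k)]
        arm = max(range(k), key=lambda i: ucb[i])
        counts[arm] += 1
        sums[arm] += rng.gauss(means[arm], 1.0)
    return counts

# With a clear gap, the optimal arm receives the bulk of the pulls,
# while suboptimal pulls grow only logarithmically in t.
counts = anytime_ucb([0.9, 0.5, 0.0], horizon=5000, seed=1)
```

Because the index ignores the horizon, the same policy can simply be left running, which is exactly the setting in which the paper's regret guarantees are stated.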


