Adaptive-treed bandits

02/11/2013
by Adam D. Bull, et al.

We describe a novel algorithm for noisy global optimisation and continuum-armed bandits, with good convergence properties over any continuous reward function having finitely many polynomial maxima. Over such functions, our algorithm achieves square-root regret in bandits, and inverse-square-root error in optimisation, without prior information. Our algorithm works by reducing these problems to tree-armed bandits, and we also provide new results in this setting. We show it is possible to adaptively combine multiple trees so as to minimise the regret, and also give near-matching lower bounds on the regret in terms of the zooming dimension.
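The reduction to tree-armed bandits can be pictured with a simple hierarchical scheme over [0,1]: maintain a binary tree of intervals, descend by optimistic upper bounds, play the selected leaf's midpoint, and refine the tree where play concentrates. The sketch below is a minimal HOO-style illustration of this idea, not the paper's algorithm; the reward function, noise level, split rule, and confidence width are all illustrative assumptions.

```python
import math
import random


class Node:
    """Tree node covering the interval [lo, hi]."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.n = 0        # number of plays in this subtree
        self.mean = 0.0   # empirical mean reward of those plays
        self.children = []


def ucb(node, t):
    """Optimistic value: mean + confidence width + interval diameter.

    The diameter term accounts for how much the reward can vary
    within the node's interval (an illustrative smoothness bonus).
    """
    if node.n == 0:
        return float("inf")
    return node.mean + math.sqrt(2.0 * math.log(t) / node.n) + (node.hi - node.lo)


def tree_bandit(reward, rounds, noise=0.05, seed=0):
    """Noisy optimisation of `reward` on [0, 1] via a tree-armed bandit."""
    rng = random.Random(seed)
    root = Node(0.0, 1.0)
    for t in range(1, rounds + 1):
        # Descend, always following the child with the largest upper bound.
        path, node = [root], root
        while node.children:
            node = max(node.children, key=lambda c: ucb(c, t))
            path.append(node)
        # Play the leaf's midpoint and observe a noisy reward.
        x = 0.5 * (node.lo + node.hi)
        r = reward(x) + rng.gauss(0.0, noise)
        # Update statistics along the whole root-to-leaf path.
        for v in path:
            v.n += 1
            v.mean += (r - v.mean) / v.n
        # Expand the played leaf into two halves.
        m = 0.5 * (node.lo + node.hi)
        node.children = [Node(node.lo, m), Node(m, node.hi)]
    # Recommend a point: follow the most-played child from the root.
    node = root
    while node.children:
        node = max(node.children, key=lambda c: c.n)
    return 0.5 * (node.lo + node.hi)
```

Under this scheme the tree deepens around the maximum, so the recovered point's accuracy improves with the budget; the paper's contribution is adaptively combining multiple such trees so that regret guarantees hold without knowing the smoothness in advance.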


