Logarithmic Bayes Regret Bounds

06/15/2023
by Alexia Atsidakou et al.

We derive the first finite-time logarithmic regret bounds for Bayesian bandits. For Gaussian bandits, we obtain an O(c_h log^2 n) bound, where c_h is a prior-dependent constant. This matches the asymptotic lower bound of Lai (1987). Our proofs mark a technical departure from prior works and are simple and general. To show generality, we apply our technique to linear bandits. Our bounds shed light on the value of the prior in the Bayesian setting, both in the objective and as side information given to the learner. They significantly improve upon the Õ(√n) bounds which, despite the existing lower bounds, have become standard in the literature.
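To make the setting concrete, the sketch below estimates the Bayes regret of a Gaussian bandit in simulation: arm means are drawn from a known Gaussian prior, a Thompson-sampling learner plays for n rounds, and regret is averaged over draws from the prior. This is a minimal illustration, not the paper's algorithm or its bound; the prior parameters, horizon, and number of runs are arbitrary choices for the example.

```python
# Minimal sketch (illustrative only, not the paper's algorithm): Bayes regret
# of Gaussian Thompson sampling on a K-armed bandit whose arm means are drawn
# from a known N(mu0, sigma0^2) prior, with unit-variance Gaussian rewards.
import numpy as np

def bayes_regret(K=5, mu0=0.0, sigma0=1.0, horizon=1000, runs=200, seed=0):
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(runs):
        theta = rng.normal(mu0, sigma0, K)       # bandit instance sampled from the prior
        post_mean = np.full(K, mu0)              # posterior means, initialized at the prior
        post_var = np.full(K, sigma0 ** 2)       # posterior variances
        regret = 0.0
        for _t in range(horizon):
            sample = rng.normal(post_mean, np.sqrt(post_var))  # Thompson sample per arm
            a = int(np.argmax(sample))
            r = rng.normal(theta[a], 1.0)        # unit-variance Gaussian reward
            # Conjugate Gaussian posterior update for the played arm
            new_var = 1.0 / (1.0 / post_var[a] + 1.0)
            post_mean[a] = new_var * (post_mean[a] / post_var[a] + r)
            post_var[a] = new_var
            regret += theta.max() - theta[a]
        total += regret
    return total / runs                          # Bayes regret: averaged over the prior

if __name__ == "__main__":
    print(f"Estimated Bayes regret: {bayes_regret():.2f}")
```

Averaging over instances drawn from the prior is what distinguishes Bayes regret from frequentist (worst-case) regret, and it is this prior-averaged quantity that the paper bounds logarithmically in n.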

