Generalized Bayesian Upper Confidence Bound with Approximate Inference for Bandit Problems

01/31/2022
by Ziyi Huang, et al.

Bayesian bandit algorithms with approximate inference are widely used in practice and often perform well, yet few studies offer a fundamental understanding of their performance. In this paper, we propose a Bayesian bandit algorithm, which we call Generalized Bayesian Upper Confidence Bound (GBUCB), for bandit problems in the presence of approximate inference. Our theoretical analysis demonstrates that, in the Bernoulli multi-armed bandit setting, GBUCB can achieve O(√T (log T)^c) frequentist regret if the inference error, measured by the symmetrized Kullback-Leibler divergence, is controllable. This analysis relies on a novel sensitivity analysis for quantile shifts with respect to inference errors. To the best of our knowledge, our work provides the first theoretical regret bound that is better than o(T) in the setting of approximate inference. Our experimental evaluations on multiple approximate inference settings corroborate our theory, showing that GBUCB is consistently superior to BUCB and Thompson sampling.
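The two quantities the abstract relates, the symmetrized Kullback-Leibler divergence between an exact and an approximate posterior and the resulting shift in a posterior quantile, can be illustrated numerically. The sketch below is not the GBUCB algorithm and is not taken from the paper. It assumes Beta posteriors for a Bernoulli arm with both shape parameters at least 1, uses a simple midpoint-rule grid in place of a statistics library, and the function names, the quantile level 0.95, and the "approximate" posterior Beta(9, 5) are all hypothetical choices made for illustration.

```python
import math


def beta_logpdf(x, a, b):
    """Log-density of Beta(a, b) at x in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x)


def midpoints(n):
    """Midpoints of n equal-width cells covering (0, 1)."""
    return [(i + 0.5) / n for i in range(n)]


def symmetrized_kl(a1, b1, a2, b2, n=10000):
    """Approximate KL(P||Q) + KL(Q||P) for P=Beta(a1,b1), Q=Beta(a2,b2).

    Uses the identity KL(P||Q) + KL(Q||P) = integral of (p - q)(log p - log q),
    evaluated with the midpoint rule. Assumes all shape parameters are >= 1.
    """
    h = 1.0 / n
    total = 0.0
    for x in midpoints(n):
        lp = beta_logpdf(x, a1, b1)
        lq = beta_logpdf(x, a2, b2)
        total += (math.exp(lp) - math.exp(lq)) * (lp - lq) * h
    return total


def beta_quantile(level, a, b, n=10000):
    """Approximate quantile of Beta(a, b) by accumulating the density on a grid."""
    h = 1.0 / n
    cdf = 0.0
    for x in midpoints(n):
        cdf += math.exp(beta_logpdf(x, a, b)) * h
        if cdf >= level:
            return x
    return 1.0


if __name__ == "__main__":
    # Hypothetical example: exact posterior after 7 successes and 3 failures
    # under a uniform prior, versus a made-up approximate posterior.
    exact = (8.0, 4.0)
    approx = (9.0, 5.0)
    level = 0.95  # illustrative upper-quantile level, not the paper's schedule

    err = symmetrized_kl(*exact, *approx)
    shift = beta_quantile(level, *approx) - beta_quantile(level, *exact)
    print("symmetrized KL:", err)
    print("quantile shift:", shift)
```

Running the script prints the inference error and the corresponding change in the upper quantile for one pair of posteriors; varying the approximate parameters gives a rough feel for how an upper-confidence index based on posterior quantiles reacts to inference error.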


