Improved Confidence Bounds for the Linear Logistic Model and Applications to Linear Bandits

11/23/2020
by Kwang-Sung Jun, et al.

We propose improved fixed-design confidence bounds for the linear logistic model. Our bounds significantly improve upon the state-of-the-art bounds of Li et al. (2017) by leveraging the self-concordance of the logistic loss, inspired by Faury et al. (2020). Specifically, our confidence width does not scale with the problem-dependent parameter 1/κ, where κ is the worst-case variance of an arm reward. At worst, 1/κ scales exponentially with the norm of the unknown linear parameter θ^*. Instead, our bound scales directly with the local variance induced by θ^*. We present two applications of our new bounds to logistic bandit problems: regret minimization and pure exploration. Our analysis shows that the new confidence bounds improve upon previous state-of-the-art performance guarantees.
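To make the κ dependence concrete, here is a minimal LaTeX sketch of how these quantities are commonly defined for the logistic model; the notation (link μ, arm set 𝒳, unit-norm arms) is assumed for illustration and is not quoted from the paper.

```latex
% Minimal sketch, assuming the standard logistic link and a unit-norm arm set X.
% The notation below is an illustrative assumption, not quoted from the paper.
\[
  \mu(z) = \frac{1}{1 + e^{-z}}, \qquad
  \dot\mu(z) = \mu(z)\bigl(1 - \mu(z)\bigr)
  \quad \text{(variance of a Bernoulli reward with mean } \mu(z)\text{)},
\]
\[
  \kappa := \min_{x \in \mathcal{X}} \dot\mu(x^\top \theta^*),
  \qquad
  \frac{1}{\dot\mu(z)} = e^{z} + 2 + e^{-z} \;\ge\; e^{z}.
\]
% Hence 1/kappa >= max_x exp(x^T theta^*), which is of order exp(||theta^*||)
% whenever some arm is nearly aligned with theta^*.  By contrast, the "local
% variance" of a fixed arm x is just \dot\mu(x^\top \theta^*), the quantity the
% proposed confidence width depends on instead of the worst case 1/kappa.
```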


Related Research

02/04/2022 · An Experimental Design Approach for Regret Minimization in Logistic Bandits
In this work we consider the problem of regret minimization for logistic...

02/14/2011 · Online Least Squares Estimation with Self-Normalized Processes: An Application to Bandit Problems
The analysis of online least squares estimation is at the heart of many ...

02/18/2020 · Improved Optimistic Algorithms for Logistic Bandits
The generalized linear bandit framework has attracted a lot of attention...

02/28/2022 · Bandit Learning with General Function Classes: Heteroscedastic Noise and Variance-dependent Regret Bounds
We consider learning a stochastic bandit model, where the reward functio...

10/23/2020 · Instance-Wise Minimax-Optimal Algorithms for Logistic Bandits
Logistic Bandits have recently attracted substantial attention, by provi...

01/06/2022 · Jointly Efficient and Optimal Algorithms for Logistic Bandits
Logistic Bandits have recently undergone careful scrutiny by virtue of t...

06/17/2016 · Structured Stochastic Linear Bandits
The stochastic linear bandit problem proceeds in rounds where at each ro...
