Sublinear Optimal Policy Value Estimation in Contextual Bandits

12/12/2019
by Weihao Kong, et al.

We study the problem of estimating the expected reward of the optimal policy in the stochastic disjoint linear bandit setting. We prove that, in certain settings, it is possible to obtain an accurate estimate of the optimal policy value even with a number of samples that is sublinear in the number required to find a policy achieving a value close to this optimum. We establish nearly matching information-theoretic lower bounds, showing that our algorithm achieves near-optimal estimation error. Finally, we demonstrate the effectiveness of our algorithm on joke recommendation and cancer inhibition dosage selection problems using real datasets.
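As a point of reference for what is being estimated, the following is a minimal Python sketch (not the paper's estimator) of the disjoint linear bandit model: each arm a has its own unknown parameter vector theta_a, the reward of pulling arm a in context x has mean <theta_a, x>, and the optimal policy value is V* = E_x[max_a <theta_a, x>]. The Gaussian context distribution, dimensions, and all variable names below are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

d, K = 5, 3                        # context dimension, number of arms (illustrative)
theta = rng.normal(size=(K, d))    # per-arm parameters of the disjoint linear model

def optimal_policy_value(theta, n_contexts=100_000):
    # Monte Carlo estimate of V* = E_x[max_a <theta_a, x>] under x ~ N(0, I_d).
    X = rng.normal(size=(n_contexts, theta.shape[1]))
    return np.max(X @ theta.T, axis=1).mean()

def greedy_policy_value(theta_hat, theta, n_contexts=100_000):
    # Value attained by acting greedily with an estimated model theta_hat,
    # i.e. the quantity a near-optimal policy would have to realize.
    X = rng.normal(size=(n_contexts, theta.shape[1]))
    chosen = np.argmax(X @ theta_hat.T, axis=1)       # arm selected from the estimate
    return (X * theta[chosen]).sum(axis=1).mean()     # mean reward under the true model

theta_hat = theta + rng.normal(scale=0.5, size=theta.shape)   # crude stand-in for a learned estimate
print("V* (optimal policy value):", optimal_policy_value(theta))
print("value of the greedy policy from theta_hat:", greedy_policy_value(theta_hat, theta))

The paper's point, in these terms, is that V* itself can be estimated accurately from fewer samples than are needed to produce an estimate whose induced policy achieves a value close to V*.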


Related research:

  02/19/2023  Estimating Optimal Policy Value in General Linear Contextual Bandits
  06/03/2019  Model selection for contextual bandits
  06/30/2020  Delayed Q-update: A novel credit assignment technique for deriving an optimal operation policy for the Grid-Connected Microgrid
  07/05/2022  Instance-optimal PAC Algorithms for Contextual Bandits
  01/06/2022  Learning Optimal Antenna Tilt Control Policies: A Contextual Linear Bandit Approach
  07/21/2021  Design of Experiments for Stochastic Contextual Linear Bandits
  10/05/2022  Tractable Optimality in Episodic Latent MABs
