Linear Partial Monitoring for Sequential Decision-Making: Algorithms, Regret Bounds and Applications

by Johannes Kirschner, et al.

Partial monitoring is an expressive framework for sequential decision-making with an abundance of applications, including graph-structured and dueling bandits, dynamic pricing, and transductive feedback models. We survey and extend recent results on the linear formulation of partial monitoring, which naturally generalizes the standard linear bandit setting. The main result is that a single algorithm, information-directed sampling (IDS), is (nearly) worst-case rate optimal in all finite-action games. We present a simple and unified analysis of stochastic partial monitoring, and further extend the model to the contextual and kernelized settings.



Information Directed Sampling for Linear Partial Monitoring

Partial monitoring is a rich framework for sequential decision making un...

Analysis and Design of Thompson Sampling for Stochastic Partial Monitoring

We investigate finite stochastic partial monitoring, which is a general ...

Toward a Classification of Finite Partial-Monitoring Games

Partial-monitoring games constitute a mathematical framework for sequent...

The End of Optimism? An Asymptotic Analysis of Finite-Armed Linear Bandits

Stochastic linear bandits are a natural and simple generalisation of fin...

Metalearning Linear Bandits by Prior Update

Fully Bayesian approaches to sequential decision-making assume that prob...

Parallelizing Contextual Linear Bandits

Standard approaches to decision-making under uncertainty focus on sequen...

Cleaning up the neighborhood: A full classification for adversarial partial monitoring

Partial monitoring is a generalization of the well-known multi-armed ban...
