A Definition of Non-Stationary Bandits

02/23/2023
by   Yueyang Liu, et al.

Non-stationary bandit learning has attracted much recent attention, yet non-stationary bandits lack a formal definition. Loosely speaking, the literature has typically characterized them as bandits for which the reward distribution changes over time. We demonstrate that this informal definition is ambiguous. Further, a widely used notion of regret, the dynamic regret, is motivated by this ambiguous definition and is therefore problematic. In particular, dynamic regret can suggest poor performance even for an optimal agent. The ambiguous definition also motivates a measure of the degree of non-stationarity experienced by a bandit; this measure often overestimates the degree of non-stationarity and can give rise to extremely loose regret bounds. The primary contribution of this paper is a formal definition that resolves the ambiguity. This definition motivates a new notion of regret, an alternative measure of the degree of non-stationarity, and a regret analysis that leads to tighter bounds for non-stationary bandit learning. The regret analysis applies to any bandit, stationary or non-stationary, and to any agent.
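To make the concern about dynamic regret concrete, the following minimal Python sketch simulates one illustrative construction. The construction is an assumption made for this example and is not taken from the abstract: a two-armed Bernoulli bandit whose success probabilities are redrawn independently and uniformly at every step. Because the current probabilities are never revealed and carry no information about future ones, no agent can expect more than 0.5 reward per step, and the observed rewards are distributed exactly like those of a stationary bandit with mean 0.5 on each arm. Dynamic regret, computed here as the gap between the best current mean and the chosen arm's current mean, still grows linearly. The function name `simulate` and its parameters are hypothetical.

```python
import random


def simulate(horizon, seed=0):
    """Return (average reward, average per-step dynamic regret).

    Assumed setup (illustrative only): two Bernoulli arms whose success
    probabilities are redrawn i.i.d. Uniform(0, 1) at every time step.
    """
    rng = random.Random(seed)
    total_reward = 0.0
    total_dynamic_regret = 0.0
    for _ in range(horizon):
        # Fresh, unobserved success probabilities for this step.
        means = [rng.random(), rng.random()]
        # History is uninformative here, so always pulling arm 0 is as
        # good in expectation as any other policy.
        arm = 0
        reward = 1.0 if rng.random() < means[arm] else 0.0
        total_reward += reward
        # Dynamic regret compares against the best arm at this very step.
        total_dynamic_regret += max(means) - means[arm]
    return total_reward / horizon, total_dynamic_regret / horizon


if __name__ == "__main__":
    avg_reward, avg_regret = simulate(200_000)
    print(f"average reward per step:         {avg_reward:.3f}")
    print(f"average dynamic regret per step: {avg_regret:.3f}")
```

Under these assumptions the expected reward per step is 1/2 and the expected dynamic regret per step is 2/3 - 1/2 = 1/6, since the expected maximum of two independent uniform draws is 2/3. An agent that cannot be improved upon therefore accumulates dynamic regret linear in the horizon, which is the kind of misleading signal the abstract describes.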


research
09/05/2020

Unifying Clustered and Non-stationary Bandits

Non-stationary bandits and online clustering of bandits lift the restric...
research
05/29/2022

An Optimization-based Algorithm for Non-stationary Kernel Bandits without Prior Knowledge

We propose an algorithm for non-stationary kernel bandits that does not ...
research
03/09/2021

Non-stationary Linear Bandits Revisited

In this note, we revisit non-stationary linear bandits, a variant of sto...
research
06/28/2022

Dynamic Memory for Interpretable Sequential Optimisation

Real-world applications of reinforcement learning for recommendation and...
research
05/25/2022

Non-stationary Bandits with Knapsacks

In this paper, we study the problem of bandits with knapsacks (BwK) in a...
research
03/04/2023

MNL-Bandit in non-stationary environments

In this paper, we study the MNL-Bandit problem in a non-stationary envir...
research
02/08/2023

Non-Stationary Bandits with Knapsack Problems with Advice

We consider a non-stationary Bandits with Knapsack problem. The outcome ...
