A/B Testing and Best-arm Identification for Linear Bandits with Robustness to Non-stationarity

by Zhihan Xiong, et al.

We investigate the fixed-budget best-arm identification (BAI) problem for linear bandits in a potentially non-stationary environment. Given a finite arm set 𝒳 ⊂ ℝ^d, a fixed budget T, and an unpredictable sequence of parameters {θ_t}_{t=1}^T, an algorithm aims to correctly identify the best arm x^* := argmax_{x∈𝒳} x^⊤ ∑_{t=1}^T θ_t with probability as high as possible. Prior work has addressed the stationary setting, where θ_t = θ_1 for all t, and demonstrated that the error probability decreases as exp(-T/ρ^*) for a problem-dependent constant ρ^*. But in many real-world A/B/n multivariate testing scenarios that motivate our work, the environment is non-stationary, and an algorithm expecting a stationary setting can easily fail. For robust identification, it is well known that if arms are chosen randomly and non-adaptively from a G-optimal design over 𝒳 at each time, then the error probability decreases as exp(-T Δ_(1)^2/d), where Δ_(1) = min_{x≠x^*} (x^* - x)^⊤ (1/T) ∑_{t=1}^T θ_t. As there exist environments where Δ_(1)^2/d ≪ 1/ρ^*, we are motivated to propose a novel algorithm, 𝖯1-𝖱𝖠𝖦𝖤, that aims to obtain the best of both worlds: robustness to non-stationarity and fast rates of identification in benign settings. We characterize the error probability of 𝖯1-𝖱𝖠𝖦𝖤 and demonstrate empirically that the algorithm indeed never performs worse than the G-optimal design but compares favorably to the best algorithms in the stationary setting.
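The non-adaptive baseline mentioned in the abstract (sampling arms i.i.d. from a G-optimal design) can be sketched in Python. This is not 𝖯1-𝖱𝖠𝖦𝖤, and it is not taken from the paper: the function names `g_optimal_design` and `robust_bai`, the Frank-Wolfe solver, the Gaussian noise model, and the inverse-design-matrix estimator are all assumptions chosen for illustration. The sketch also assumes the arms span ℝ^d, so that the design matrix is invertible.

```python
import numpy as np


def g_optimal_design(X, iters=2000, tol=1e-3):
    """Approximate a G-optimal design over the rows of X with Frank-Wolfe.

    By the Kiefer-Wolfowitz theorem, G-optimal and D-optimal designs coincide,
    and the optimum satisfies max_x x^T A(lam)^{-1} x = d.
    Assumes the rows of X span R^d.
    """
    n, d = X.shape
    lam = np.full(n, 1.0 / n)
    for _ in range(iters):
        A_inv = np.linalg.inv(X.T @ (lam[:, None] * X))
        g = np.einsum("ij,jk,ik->i", X, A_inv, X)  # x^T A^{-1} x for each arm
        j = int(np.argmax(g))
        if g[j] <= d * (1.0 + tol):  # close enough to the optimum
            break
        step = (g[j] / d - 1.0) / (g[j] - 1.0)  # exact line-search step
        lam = (1.0 - step) * lam
        lam[j] += step
    return lam / lam.sum()


def robust_bai(X, thetas, rng, noise_sd=1.0):
    """Pull arms i.i.d. from the design and recommend an arm after T rounds.

    thetas has shape (T, d); row t is the (possibly drifting) parameter theta_t.
    The estimator A^{-1} * mean_t(x_t r_t) is unbiased for the average theta,
    because E[x_t x_t^T] = A under the design (an illustrative choice).
    """
    T = len(thetas)
    lam = g_optimal_design(X)
    A_inv = np.linalg.inv(X.T @ (lam[:, None] * X))
    idx = rng.choice(len(X), size=T, p=lam)
    pulls = X[idx]
    rewards = np.einsum("td,td->t", pulls, thetas) + noise_sd * rng.standard_normal(T)
    theta_hat = A_inv @ (pulls * rewards[:, None]).mean(axis=0)
    return int(np.argmax(X @ theta_hat))


# Hypothetical usage: three orthogonal arms, with a drifting second coordinate.
rng = np.random.default_rng(0)
T = 5000
arms = np.eye(3)
thetas = np.tile([1.0, 0.0, 0.0], (T, 1))
thetas[:, 1] += np.sin(2 * np.pi * np.arange(T) / T)  # averages to about 0
best = robust_bai(arms, thetas, rng)
```

Because the sampling distribution never depends on past rewards, the estimate targets the average parameter (1/T) ∑_t θ_t regardless of how θ_t moves, which is the source of the robustness the abstract describes.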



∙ 06/09/2021

Fixed-Budget Best-Arm Identification in Contextual Bandits: A Static-Adaptive Algorithm

We study the problem of best-arm identification (BAI) in contextual band...
∙ 03/16/2023

On the Existence of a Complexity in Fixed Budget Bandit Identification

In fixed budget bandit identification, an algorithm sequentially observe...
∙ 05/27/2021

Towards Minimax Optimal Best Arm Identification in Linear Bandits

We study the problem of best arm identification in linear bandits in the...
∙ 10/15/2020

Probabilistic Sequential Shrinking: A Best Arm Identification Algorithm for Stochastic Bandits with Corruptions

We consider a best arm identification (BAI) problem for stochastic bandi...
∙ 10/22/2020

Quantile Bandits for Best Arms Identification with Concentration Inequalities

We consider a variant of the best arm identification task in stochastic ...
∙ 02/15/2023

Best Arm Identification for Stochastic Rising Bandits

Stochastic Rising Bandits is a setting in which the values of the expect...
∙ 06/22/2022

Active Learning with Safety Constraints

Active learning methods have shown great promise in reducing the number ...
