On Optimal Robustness to Adversarial Corruption in Online Decision Problems

09/22/2021
by   Shinji Ito, et al.

This paper considers two fundamental sequential decision-making problems: prediction with expert advice and the multi-armed bandit problem. We focus on stochastic regimes in which an adversary may corrupt losses, and we investigate what level of robustness can be achieved against adversarial corruption. The main contribution of this paper is to show that optimal robustness can be expressed by a square-root dependency on the amount of corruption. More precisely, we show that two classes of algorithms, anytime Hedge with a decreasing learning rate and algorithms with second-order regret bounds, achieve O( log N / Δ + √( C log N / Δ ) ) regret, where N, Δ, and C denote the number of experts, the suboptimality gap, and the corruption level, respectively. We further provide a matching lower bound, which means this regret bound is tight up to a constant factor. For the multi-armed bandit problem, we also provide a lower bound that is tight up to a logarithmic factor.
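To make the first algorithm class concrete, below is a minimal sketch of anytime Hedge with a decreasing learning rate, run on a stochastic expert-advice instance. The schedule η_t = c √(log N / t), the constant c, and the simulated Bernoulli losses are illustrative assumptions for this sketch, not the paper's exact tuning or analysis.

```python
import numpy as np

def anytime_hedge(loss_matrix, c=1.0):
    """Anytime Hedge with a decreasing learning rate (illustrative sketch).

    loss_matrix: (T, N) array of per-round expert losses in [0, 1].
    Uses eta_t = c * sqrt(log(N) / t); the schedule and the constant c
    are assumptions for illustration, not the paper's tuned choice.
    Returns the expected regret against the best fixed expert.
    """
    T, N = loss_matrix.shape
    cum_loss = np.zeros(N)      # cumulative loss of each expert so far
    alg_loss = 0.0              # cumulative expected loss of the algorithm
    for t in range(1, T + 1):
        eta = c * np.sqrt(np.log(N) / t)                 # decreasing learning rate
        w = np.exp(-eta * (cum_loss - cum_loss.min()))   # exponential weights
        p = w / w.sum()                                  # distribution over experts
        loss = loss_matrix[t - 1]
        alg_loss += p @ loss                             # expected loss this round
        cum_loss += loss
    return alg_loss - cum_loss.min()

# Hypothetical stochastic instance: expert 0 is best with gap Delta = 0.2.
rng = np.random.default_rng(0)
T, N = 10_000, 8
means = np.full(N, 0.5)
means[0] = 0.3
losses = rng.binomial(1, means, size=(T, N)).astype(float)
print(anytime_hedge(losses))
```

Subtracting the minimum cumulative loss before exponentiating is only for numerical stability and leaves the resulting distribution unchanged. Adversarial corruption would correspond to perturbing entries of `loss_matrix` with total perturbation mass C, the quantity appearing under the square root in the regret bound above.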

