Smoothness-Adaptive Stochastic Bandits

10/22/2019
by Yonatan Gur, et al.

We consider the problem of non-parametric multi-armed bandits with stochastic covariates, where the smoothness of the payoff functions is a key factor both in determining the complexity of the problem and in designing effective policies. Previous work treats this problem when the smoothness of the payoff functions is known a priori. In practical settings, however, the smoothness that characterizes the class of functions to which the payoff functions belong is not known in advance, and misspecifying it may cause the performance of existing methods to deteriorate severely. In this work, we address the challenge of adapting to a priori unknown smoothness in the payoff functions. Our approach is based on the notion of self-similarity that appears in the literature on adaptive non-parametric confidence intervals. We develop a procedure that infers a global smoothness parameter of the payoff functions from collected observations, and establish that this procedure achieves rate-optimal performance up to logarithmic factors. We further extend this method to account for the local complexity of the problem, which depends on how smooth the payoff functions are in different regions of the covariate space. We show that, under reasonable assumptions on the way this smoothness changes over the covariate space, our method achieves significantly improved performance that is characterized by the local complexity of the problem as opposed to its global complexity.
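
To give a rough feel for how a smoothness parameter might be inferred from data, the sketch below compares piecewise-constant approximations of a function at two consecutive resolutions. It is an illustration only and not the procedure developed in the paper: the function names `sup_gap` and `estimate_smoothness`, the noise-free setting, the one-dimensional covariate, and the test function sqrt(x) are all assumptions made here. The idea it relies on is that for a Hölder function with exponent β the gap between approximations at bin widths h and h/2 is at most of order h^β, and a self-similarity-type condition keeps that gap from shrinking faster, so the ratio of gaps at consecutive resolutions is roughly 2^β.

```python
import numpy as np


def sup_gap(values, level):
    """Largest gap between bin averages at 2**level and 2**(level + 1) bins."""
    coarse = values.reshape(2 ** level, -1).mean(axis=1)
    fine = values.reshape(2 ** (level + 1), -1).mean(axis=1)
    # Each coarse bin covers exactly two consecutive fine bins.
    return np.max(np.abs(np.repeat(coarse, 2) - fine))


def estimate_smoothness(f, level=4, grid_size=2 ** 16):
    """Hypothetical estimator: log2 of the ratio of gaps at consecutive levels."""
    # Noise-free evaluations on a uniform grid of midpoints in [0, 1]
    # (an assumption; in a bandit problem these would be noisy rewards).
    x = (np.arange(grid_size) + 0.5) / grid_size
    values = f(x)
    return np.log2(sup_gap(values, level) / sup_gap(values, level + 1))


# sqrt(x) is Hölder continuous with exponent 1/2 on [0, 1].
beta_hat = estimate_smoothness(np.sqrt)
print(f"estimated smoothness: {beta_hat:.3f}")
```

With noisy observations the bin averages would have to be estimated from samples, and the resolution would need to be chosen so that the gap dominates the sampling error; the sketch leaves that trade-off out.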

research
09/05/2019

Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes

We study a nonparametric contextual bandit problem where the expected re...
research
11/22/2022

Transfer Learning for Contextual Multi-armed Bandits

Motivated by a range of applications, we study in this paper the problem...
research
03/27/2017

A Scale Free Algorithm for Stochastic Bandits with Bounded Kurtosis

Existing strategies for finite-armed stochastic bandits mostly depend on...
research
03/03/2020

Bounded Regret for Finitely Parameterized Multi-Armed Bandits

We consider the problem of finitely parameterized multi-armed bandits wh...
research
01/19/2021

Can smooth graphons in several dimensions be represented by smooth graphons on [0,1]?

A graphon that is defined on [0,1]^d and is Hölder(α) continuous for som...
research
05/07/2019

Sparse multiresolution representations with adaptive kernels

Reproducing kernel Hilbert spaces (RKHSs) are key elements of many non-p...
research
11/19/2019

Optimal Complexity and Certification of Bregman First-Order Methods

We provide a lower bound showing that the O(1/k) convergence rate of the...