Transfer Learning for Contextual Multi-armed Bandits

11/22/2022
by   Changxiao Cai, et al.
Motivated by a range of applications, we study in this paper the problem of transfer learning for nonparametric contextual multi-armed bandits under the covariate shift model, where we have data collected on source bandits before the start of the target bandit learning. The minimax rate of convergence for the cumulative regret is established and a novel transfer learning algorithm that attains the minimax regret is proposed. The results quantify the contribution of the data from the source domains for learning in the target domain in the context of nonparametric contextual multi-armed bandits. In view of the general impossibility of adaptation to unknown smoothness, we develop a data-driven algorithm that achieves near-optimal statistical guarantees (up to a logarithmic factor) while automatically adapting to the unknown parameters over a large collection of parameter spaces under an additional self-similarity assumption. A simulation study is carried out to illustrate the benefits of utilizing the data from the auxiliary source domains for learning in the target domain.
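
For readers new to the setting, the quantities named in the abstract can be written out in a standard form. The notation below ($K$, $f_k$, $P$, $Q$, $\pi_t$, $n$, $\Pi$, $\mathcal{F}$) is assumed here for illustration and may not match the paper's own symbols; it is a sketch of the usual formulation, not a restatement of the paper's definitions.

```latex
% Assumed notation: K arms, reward functions f_k, horizon n, policy \pi = (\pi_t).
% Mean reward of arm k at covariate x (shared by source and target under covariate shift):
\[
  f_k(x) = \mathbb{E}\bigl[\, Y^{(k)} \mid X = x \,\bigr], \qquad k = 1, \dots, K .
\]
% Covariate shift: only the covariate marginals differ between domains.
\[
  X^{\mathrm{source}} \sim P, \qquad X^{\mathrm{target}} \sim Q, \qquad P \neq Q \ \text{in general}.
\]
% Cumulative regret of a policy \pi over n target rounds:
\[
  R_n(\pi) = \mathbb{E}\Biggl[\, \sum_{t=1}^{n}
     \Bigl( \max_{1 \le k \le K} f_k(X_t) - f_{\pi_t(X_t)}(X_t) \Bigr) \Biggr],
  \qquad X_t \sim Q .
\]
% Minimax regret over a class \Pi of admissible policies and a class \mathcal{F} of problem instances:
\[
  \inf_{\pi \in \Pi} \; \sup_{(f_1, \dots, f_K,\, P,\, Q) \in \mathcal{F}} R_n(\pi).
\]
```

In this form, the minimax rate describes how the last quantity scales with the number of target rounds and the amount of source data, which is the sense in which the results quantify the contribution of the source domains.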

research
04/03/2019

Batched Multi-armed Bandits Problem

In this paper, we study the multi-armed bandit problem in the batched se...
research
06/07/2019

Transfer Learning for Nonparametric Classification: Minimax Rate and Adaptive Classifier

Human learners have the natural ability to use knowledge gained in one s...
research
11/14/2022

Hypothesis Transfer in Bandits by Weighted Models

We consider the problem of contextual multi-armed bandits in the setting...
research
10/22/2019

Smoothness-Adaptive Stochastic Bandits

We consider the problem of non-parametric multi-armed bandits with stoch...
research
10/15/2020

Continuum-Armed Bandits: A Function Space Perspective

Continuum-armed bandits (a.k.a., black-box or 0^th-order optimization) i...
research
06/08/2021

Adaptive transfer learning

In transfer learning, we wish to make inference about a target populatio...
research
02/04/2021

Transfer Learning in Bandits with Latent Continuity

Structured stochastic multi-armed bandits provide accelerated regret rat...
