Homomorphically Encrypted Linear Contextual Bandit

03/17/2021
by   Evrard Garcelon, et al.

The contextual bandit is a general framework for online learning in sequential decision-making problems and has found applications in a wide range of domains, including recommendation systems, online advertising, clinical trials, and many more. A critical aspect of bandit methods is that they require observing the contexts (i.e., individual- or group-level data) and the rewards in order to solve the sequential problem. Their large-scale deployment in industrial applications has increased interest in methods that preserve user privacy. In this paper, we introduce a privacy-preserving bandit framework based on asymmetric encryption: the bandit algorithm only observes encrypted information (contexts and rewards) and has no ability to decrypt it. Leveraging homomorphic encryption, we show that despite the complexity of the setting, it is possible to learn over encrypted data. We introduce an algorithm that achieves an O(d√T) regret bound in any linear contextual bandit problem while keeping the data encrypted.
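To give a concrete sense of "learning over encrypted data", the sketch below implements a toy additively homomorphic scheme (Paillier-style) in pure Python. This is an illustration of the homomorphic properties a bandit learner could exploit (summing encrypted rewards, scaling encrypted features), not the paper's actual construction; the tiny primes make it entirely insecure and are chosen only so the example runs instantly.

```python
import math
import random

# Toy Paillier cryptosystem -- additively homomorphic, for illustration only.
# The primes are far too small to be secure.
p, q = 293, 433            # small demonstration primes
n = p * q                  # public modulus
n2 = n * n
g = n + 1                  # standard generator choice for Paillier
lam = math.lcm(p - 1, q - 1)   # private key (Python 3.9+)

def L(x):
    """Paillier's L function: L(x) = (x - 1) / n."""
    return (x - 1) // n

# Decryption helper mu = (L(g^lam mod n^2))^{-1} mod n  (Python 3.8+ pow)
mu = pow(L(pow(g, lam, n2)), -1, n)

def encrypt(m):
    """Encrypt integer m (0 <= m < n) with fresh randomness r."""
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    """Recover the plaintext from ciphertext c."""
    return (L(pow(c, lam, n2)) * mu) % n

# Homomorphic properties a learner can use without ever decrypting:
c1, c2 = encrypt(7), encrypt(5)
assert decrypt((c1 * c2) % n2) == 12   # Enc(a) * Enc(b) decrypts to a + b
assert decrypt(pow(c1, 3, n2)) == 21   # Enc(a)^k decrypts to k * a
```

The two assertions at the end show the core trick: multiplying ciphertexts adds the underlying plaintexts, and exponentiating a ciphertext by a known scalar multiplies the plaintext, which is exactly the kind of linear algebra (inner products, running sums) a linear bandit update needs.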


Related research

- Contextual Bandit with Missing Rewards (07/13/2020)
- Tight Regret Bounds for Infinite-armed Linear Contextual Bandits (05/04/2019)
- Robust Stochastic Linear Contextual Bandits Under Adversarial Attacks (06/05/2021)
- Joint AP Probing and Scheduling: A Contextual Bandit Approach (08/06/2021)
- Adaptive Representation Selection in Contextual Bandit with Unlabeled History (02/03/2018)
- Practical Bandits: An Industry Perspective (02/02/2023)
- A Smoothed Analysis of the Greedy Algorithm for the Linear Contextual Bandit Problem (01/10/2018)
