Local Differentially Private Regret Minimization in Reinforcement Learning

10/15/2020
by   Evrard Garcelon, et al.

Reinforcement learning algorithms are widely used in domains where it is desirable to provide a personalized service. In these domains, user data often contains sensitive information that needs to be protected from third parties. Motivated by this, we study privacy in the context of finite-horizon Markov Decision Processes (MDPs) by requiring information to be obfuscated on the user side. We formulate this notion of privacy for RL by leveraging the local differential privacy (LDP) framework. We present an optimistic algorithm that simultaneously satisfies LDP requirements and achieves sublinear regret. We also establish a lower bound for regret minimization in finite-horizon MDPs with LDP guarantees. These results show that while LDP is appealing in practical applications, the setting is inherently more complex than its non-private counterpart. In particular, our results demonstrate that the cost of privacy is multiplicative when compared to non-private settings.
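To make "obfuscated on the user side" concrete, here is a minimal sketch of one standard way to randomize data locally: the Laplace mechanism applied to a user's per-step rewards before they leave the device. This is an illustration only and is not claimed to be the mechanism used in the paper. The function name `privatize_rewards`, the assumption that rewards lie in [0, 1], and the choice to spend the whole budget `epsilon` on the reward vector are all assumptions made for this example.

```python
import numpy as np


def privatize_rewards(rewards, epsilon, rng=None):
    """Return a noisy copy of one episode's rewards (hypothetical helper).

    Assumes each reward lies in [0, 1]. Two reward vectors of length H can
    then differ by at most H in L1 norm, so Laplace noise with scale
    H / epsilon on every coordinate makes the released vector epsilon-LDP.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = np.random.default_rng() if rng is None else rng
    # Clip so the sensitivity bound above actually holds.
    clipped = np.clip(np.asarray(rewards, dtype=float), 0.0, 1.0)
    horizon = clipped.size
    scale = horizon / epsilon
    # Noise is added on the user side; only the noisy vector is shared.
    return clipped + rng.laplace(loc=0.0, scale=scale, size=clipped.shape)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    episode_rewards = [0.0, 1.0, 0.5]
    print(privatize_rewards(episode_rewards, epsilon=1.0, rng=rng))
```

The noise scale grows as `epsilon` shrinks, so the learner receives less accurate feedback under stronger privacy. That is the intuition behind a regret cost for privacy, though the sketch says nothing about the specific bounds established in the paper.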

research
12/20/2021

Differentially Private Regret Minimization in Episodic Markov Decision Processes

We study regret minimization in finite horizon tabular Markov decision p...
research
01/18/2022

Differentially Private Reinforcement Learning with Linear Function Approximation

Motivated by the wide adoption of reinforcement learning (RL) in real-wo...
research
10/19/2021

Locally Differentially Private Reinforcement Learning for Linear Mixture Markov Decision Processes

Reinforcement learning (RL) algorithms can be used to provide personaliz...
research
06/01/2023

Differentially Private Episodic Reinforcement Learning with Heavy-tailed Rewards

In this paper, we study the problem of (finite horizon tabular) Markov d...
research
08/26/2021

Adaptive Control of Differentially Private Linear Quadratic Systems

In this paper, we study the problem of regret minimization in reinforcem...
research
05/10/2023

An Option-Dependent Analysis of Regret Minimization Algorithms in Finite-Horizon Semi-Markov Decision Processes

A large variety of real-world Reinforcement Learning (RL) tasks is chara...
research
06/02/2022

Offline Reinforcement Learning with Differential Privacy

The offline reinforcement learning (RL) problem is often motivated by th...
