Private Reinforcement Learning with PAC and Regret Guarantees

09/18/2020
by Giuseppe Vietri, et al.

Motivated by high-stakes decision-making domains like personalized medicine, where user information is inherently sensitive, we design privacy-preserving exploration policies for episodic reinforcement learning (RL). We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP), a strong variant of differential privacy for settings where each user receives their own set of outputs (e.g., policy recommendations). We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds and enjoys a JDP guarantee. Our algorithm pays only a moderate privacy cost for exploration: in comparison to the non-private bounds, the privacy parameter appears only in lower-order terms. Finally, we present lower bounds on sample complexity and regret for reinforcement learning subject to JDP.


research
02/02/2022

Improved Regret for Differentially Private Exploration in Linear MDP

We study privacy-preserving exploration in sequential decision-making fo...
research
06/02/2022

Offline Reinforcement Learning with Differential Privacy

The offline reinforcement learning (RL) problem is often motivated by th...
research
12/09/2022

Near-Optimal Differentially Private Reinforcement Learning

Motivated by personalized healthcare and other applications involving se...
research
01/30/2019

Private Q-Learning with Functional Noise in Continuous Spaces

We consider privacy-preserving algorithms for deep reinforcement learnin...
research
03/18/2022

Privacy-Preserving Reinforcement Learning Beyond Expectation

Cyber and cyber-physical systems equipped with machine learning algorith...
research
12/20/2021

Differentially Private Regret Minimization in Episodic Markov Decision Processes

We study regret minimization in finite horizon tabular Markov decision p...
research
03/27/2018

Privacy-preserving Prediction

Ensuring differential privacy of models learned from sensitive user data...
