Differentially Private Reinforcement Learning with Linear Function Approximation

01/18/2022
by   Xingyu Zhou, et al.
0

Motivated by the wide adoption of reinforcement learning (RL) in real-world personalized services, where users' sensitive and private information needs to be protected, we study regret minimization in finite-horizon Markov decision processes (MDPs) under the constraints of differential privacy (DP). Compared to existing private RL algorithms that work only on tabular finite-state, finite-actions MDPs, we take the first step towards privacy-preserving learning in MDPs with large state and action spaces. Specifically, we consider MDPs with linear function approximation (in particular linear mixture MDPs) under the notion of joint differential privacy (JDP), where the RL agent is responsible for protecting users' sensitive data. We design two private RL algorithms that are based on value iteration and policy optimization, respectively, and show that they enjoy sub-linear regret performance while guaranteeing privacy protection. Moreover, the regret bounds are independent of the number of states, and scale at most logarithmically with the number of actions, making the algorithms suitable for privacy protection in nowadays large-scale personalized services. Our results are achieved via a general procedure for learning in linear mixture MDPs under changing regularizers, which not only generalizes previous results for non-private learning, but also serves as a building block for general private reinforcement learning.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
12/20/2021

Differentially Private Regret Minimization in Episodic Markov Decision Processes

We study regret minimization in finite horizon tabular Markov decision p...
research
10/19/2021

Locally Differentially Private Reinforcement Learning for Linear Mixture Markov Decision Processes

Reinforcement learning (RL) algorithms can be used to provide personaliz...
research
08/26/2021

Adaptive Control of Differentially Private Linear Quadratic Systems

In this paper, we study the problem of regret minimization in reinforcem...
research
10/15/2020

Local Differentially Private Regret Minimization in Reinforcement Learning

Reinforcement learning algorithms are widely used in domains where it is...
research
12/09/2022

Near-Optimal Differentially Private Reinforcement Learning

Motivated by personalized healthcare and other applications involving se...
research
12/02/2021

Differentially Private Exploration in Reinforcement Learning with Linear Representation

This paper studies privacy-preserving exploration in Markov Decision Pro...
research
12/30/2020

Privacy-Constrained Policies via Mutual Information Regularized Policy Gradients

As reinforcement learning techniques are increasingly applied to real-wo...

Please sign up or login with your details

Forgot password? Click here to reset