How are policy gradient methods affected by the limits of control?

06/14/2022
by Ingvar Ziemann, et al.

We study stochastic policy gradient methods from the perspective of control-theoretic limitations. Our main result is that ill-conditioned linear systems in the sense of Doyle inevitably lead to noisy gradient estimates. We also give an example of a class of stable systems in which policy gradient methods suffer from the curse of dimensionality. Our results apply to both state feedback and partially observed systems.
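
As a purely illustrative sketch (not the paper's construction), the snippet below forms a single-trajectory score-function (REINFORCE-style) estimate of the gradient of a finite-horizon LQR cost with respect to a linear state-feedback gain. The system matrices A and B, cost weights Q and R, horizon T, exploration scale sigma, and the helper rollout_gradient are hypothetical placeholders, chosen only to show what a stochastic policy gradient estimate for a linear system looks like and how noisy a single sample can be.

import numpy as np

# Illustrative sketch only (not the paper's construction): a single-trajectory
# score-function (REINFORCE-style) estimate of grad_K J(K) for a finite-horizon
# LQR cost, under the Gaussian exploration policy u_t = -K x_t + sigma * w_t.
# All dimensions and numbers below are placeholders.

rng = np.random.default_rng(0)

n, m, T = 2, 1, 50                  # state dim, input dim, horizon (hypothetical)
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])          # example dynamics matrix
B = np.array([[0.0],
              [0.1]])               # example input matrix
Q, R = np.eye(n), np.eye(m)         # quadratic cost weights
sigma = 0.1                         # exploration noise standard deviation

def rollout_gradient(K, x0):
    """One Monte Carlo sample of grad_K J(K) from a single rollout."""
    x, cost = x0.copy(), 0.0
    score = np.zeros_like(K)        # accumulates grad_K of the action log-likelihoods
    for _ in range(T):
        w = rng.normal(size=m)
        u = -K @ x + sigma * w
        cost += x @ Q @ x + u @ R @ u
        # grad_K log N(u; -K x, sigma^2 I) = -(u + K x) x^T / sigma^2 = -w x^T / sigma
        score += -np.outer(w, x) / sigma
        x = A @ x + B @ u
    return cost * score             # unbiased but typically high-variance estimate

K = np.zeros((m, n))
grads = np.array([rollout_gradient(K, rng.normal(size=n)) for _ in range(1000)])
print("empirical std of gradient entries:\n", grads.std(axis=0))

Averaging many such samples reduces the noise; the abstract's claim is that for ill-conditioned systems in the sense of Doyle, the gradient estimates are inevitably noisy, so the sampling burden cannot be avoided.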


Related research

11/06/2018 · Are Deep Policy Gradient Algorithms Truly Policy Gradient Algorithms?
We study how the behavior of deep policy gradient algorithms reflects th...

07/16/2023 · Feedback is All You Need: Real-World Reinforcement Learning with Approximate Physics-Based Models
We focus on developing efficient and reliable policy optimization strate...

11/20/2020 · Policy Gradient Methods for the Noisy Linear Quadratic Regulator over a Finite Horizon
We explore reinforcement learning methods for finding the optimal policy...

01/03/2017 · A K-fold Method for Baseline Estimation in Policy Gradient Algorithms
The high variance issue in unbiased policy-gradient methods such as VPG ...

11/03/2020 · A Study of Policy Gradient on a Class of Exactly Solvable Models
Policy gradient methods are extensively used in reinforcement learning a...

07/21/2020 · A Note on the Linear Convergence of Policy Gradient Methods
We revisit the finite time analysis of policy gradient methods in the si...

06/18/2020 · Competitive Policy Optimization
A core challenge in policy optimization in competitive Markov decision p...
