Efficient Wasserstein Natural Gradients for Reinforcement Learning

10/12/2020
by   Ted Moskovitz, et al.

A novel optimization approach is proposed for application to policy gradient methods and evolution strategies for reinforcement learning (RL). The procedure uses a computationally efficient Wasserstein natural gradient (WNG) descent that takes advantage of the geometry induced by a Wasserstein penalty to speed optimization. This method follows the recent theme in RL of including a divergence penalty in the objective to establish a trust region. Experiments on challenging tasks demonstrate improvements in both computational cost and performance over advanced baselines.
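To make the trust-region idea concrete, here is a minimal sketch (not the paper's algorithm) of a penalized policy update: a diagonal-Gaussian policy is updated on a surrogate objective minus a squared 2-Wasserstein penalty to the previous policy, so large parameter jumps are discouraged. The function names, learning rate, and penalty weight `lam` are illustrative assumptions.

```python
import numpy as np

# Hedged sketch: a Wasserstein-penalized policy-gradient step for a
# diagonal-Gaussian policy. This illustrates the "divergence penalty as
# trust region" idea only; it is not the paper's WNG procedure.

def w2_sq_diag_gaussian(mu1, sigma1, mu2, sigma2):
    """Closed-form squared 2-Wasserstein distance between diagonal Gaussians."""
    return np.sum((mu1 - mu2) ** 2) + np.sum((sigma1 - sigma2) ** 2)

def penalized_step(mu, sigma, grad_mu, grad_sigma, mu_old, sigma_old,
                   lr=0.1, lam=1.0):
    """One ascent step on: surrogate objective - lam * W2^2(policy, old policy).

    grad_mu / grad_sigma are gradients of the unpenalized surrogate
    with respect to the policy mean and standard deviation.
    """
    # Gradient of the W2^2 penalty with respect to mu and sigma.
    pen_grad_mu = 2.0 * (mu - mu_old)
    pen_grad_sigma = 2.0 * (sigma - sigma_old)
    mu_new = mu + lr * (grad_mu - lam * pen_grad_mu)
    sigma_new = sigma + lr * (grad_sigma - lam * pen_grad_sigma)
    return mu_new, sigma_new
```

With `lam > 0` the update is pulled back toward the previous policy, so the new policy stays closer (in W2) than an unpenalized step would; WNG methods go further by preconditioning the gradient with the geometry this penalty induces, rather than just subtracting its gradient.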


Related research

- Wasserstein Reinforcement Learning (06/11/2019): We propose behavior-driven optimization via Wasserstein distances (WDs) ...
- Natural Policy Gradients In Reinforcement Learning Explained (09/05/2022): Traditional policy gradient methods are fundamentally flawed. Natural gr...
- Policy Optimization as Wasserstein Gradient Flows (08/09/2018): Policy optimization is a core component of reinforcement learning (RL), ...
- On Wasserstein Reinforcement Learning and the Fokker-Planck equation (12/19/2017): Policy gradients methods often achieve better performance when the chang...
- Kernelized Wasserstein Natural Gradient (10/21/2019): Many machine learning problems can be expressed as the optimization of s...
- Learning to Unknot (10/28/2020): We introduce natural language processing into the study of knot theory, ...
- Stochastically Dominant Distributional Reinforcement Learning (05/17/2019): We describe a new approach for mitigating risk in the Reinforcement Lear...
