Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning

07/20/2021
by Denis Yarats, et al.

We present DrQ-v2, a model-free reinforcement learning (RL) algorithm for visual continuous control. DrQ-v2 builds on DrQ, an off-policy actor-critic approach that uses data augmentation to learn directly from pixels. We introduce several improvements that yield state-of-the-art results on the DeepMind Control Suite. Notably, DrQ-v2 solves complex humanoid locomotion tasks directly from pixel observations, which model-free RL had not previously achieved. DrQ-v2 is conceptually simple, easy to implement, and has a significantly smaller computational footprint than prior work, with the majority of tasks taking just 8 hours to train on a single GPU. Finally, we publicly release DrQ-v2's implementation to provide RL practitioners with a strong and computationally efficient baseline.
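
The abstract does not say which image augmentation is used, so the following is only an illustrative sketch of one common choice for pixel-based RL: a random shift implemented as edge padding followed by a random crop. The function name `random_shift`, the `pad=4` default, and the integer-pixel shift are assumptions made for this example, not details taken from the text or from the released implementation.

```python
import numpy as np


def random_shift(images, pad=4, rng=None):
    """Randomly shift each image in a batch by up to `pad` pixels.

    Hypothetical helper for illustration only.
    images: array of shape (batch, channels, height, width).
    """
    rng = np.random.default_rng() if rng is None else rng
    n, _, h, w = images.shape
    # Replicate edge pixels so the shifted image has no blank border.
    padded = np.pad(
        images, ((0, 0), (0, 0), (pad, pad), (pad, pad)), mode="edge"
    )
    out = np.empty_like(images)
    for i in range(n):
        # Draw an independent crop offset for every image in the batch.
        top = rng.integers(0, 2 * pad + 1)
        left = rng.integers(0, 2 * pad + 1)
        out[i] = padded[i, :, top:top + h, left:left + w]
    return out
```

In an off-policy actor-critic setup, a transform like this would typically be applied to observations sampled from the replay buffer before they reach the encoder, so that the critic and actor see slightly perturbed views of the same state.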


03/15/2021

Sample-efficient Reinforcement Learning Representation Learning with Curiosity Contrastive Forward Dynamics Model

Developing an agent in reinforcement learning (RL) that is capable of pe...
04/28/2020

Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels

We propose a simple data augmentation technique that can be applied to s...
10/11/2021

Recurrent Model-Free RL is a Strong Baseline for Many POMDPs

Many problems in RL, such as meta RL, robust RL, and generalization in R...
10/09/2020

Deep RL With Information Constrained Policies: Generalization in Continuous Control

Biological agents learn and act intelligently in spite of a highly limit...
06/09/2021

Bayesian Bellman Operators

We introduce a novel perspective on Bayesian reinforcement learning (RL)...
07/20/2021

Proximal Policy Optimization for Tracking Control Exploiting Future Reference Information

In recent years, reinforcement learning (RL) has gained increasing atten...
07/24/2020

Predictive Information Accelerates Learning in RL

The Predictive Information is the mutual information between the past an...

Code Repositories

drqv2

DrQ-v2: Improved Data-Augmented Reinforcement Learning

