Deep RL With Information Constrained Policies: Generalization in Continuous Control

10/09/2020
by Tyler Malloy, et al.

Biological agents learn and act intelligently despite a highly limited capacity to process and store information. Many real-world problems involve continuous control, which remains a difficult task for artificial intelligence agents. In this paper we explore the learning advantages that a natural constraint on information flow might confer on artificial agents in continuous control tasks. We focus on the model-free reinforcement learning (RL) setting and formalize our approach as an information-theoretic constraint on the complexity of learned policies. We show that this approach emerges in a principled fashion from the application of rate-distortion theory. We implement a novel Capacity-Limited Actor-Critic (CLAC) algorithm and situate it within a broader family of RL algorithms that includes Soft Actor-Critic (SAC) and Mutual Information Reinforcement Learning (MIRL). Our experiments on continuous control tasks show that, compared to alternative approaches, CLAC improves generalization between training environments and modified test environments. CLAC achieves this while retaining the high sample efficiency of similar methods.
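As a rough illustration of what a constraint on policy complexity can look like, the sketch below measures the mutual information I(S;A) between states and actions for a small tabular policy and subtracts it, scaled by a trade-off weight, from an expected return. This is an assumption-laden toy, not the CLAC implementation: the abstract does not state the exact objective, the penalized form `expected_return - beta * I(S;A)` is assumed here for illustration, and the function names, the `beta` parameter, and the discrete state and action spaces are all hypothetical (CLAC itself targets continuous control).

```python
import math


def policy_mutual_information(state_probs, policy):
    """Mutual information I(S;A) in nats for a tabular policy.

    state_probs[s] is p(s); policy[s][a] is pi(a|s).
    Both are illustrative inputs, not quantities taken from the paper.
    """
    n_actions = len(policy[0])
    # Marginal action distribution: p(a) = sum_s p(s) * pi(a|s)
    marginal = [
        sum(p_s * row[a] for p_s, row in zip(state_probs, policy))
        for a in range(n_actions)
    ]
    mi = 0.0
    for p_s, row in zip(state_probs, policy):
        for a, p_a_given_s in enumerate(row):
            # Zero-probability terms contribute nothing to the sum.
            if p_s > 0 and p_a_given_s > 0:
                mi += p_s * p_a_given_s * math.log(p_a_given_s / marginal[a])
    return mi


def penalized_objective(expected_return, mutual_information, beta):
    """Assumed trade-off: reward minus beta times policy complexity."""
    return expected_return - beta * mutual_information


# A policy that ignores the state carries no information about it.
state_independent = [[0.5, 0.5], [0.5, 0.5]]
# A policy that picks a different action in each state is maximally informative.
state_dependent = [[1.0, 0.0], [0.0, 1.0]]
uniform_states = [0.5, 0.5]

mi_low = policy_mutual_information(uniform_states, state_independent)
mi_high = policy_mutual_information(uniform_states, state_dependent)
```

Under this assumed penalty, two policies with the same expected return are ranked by how little their actions depend on the state, which is one way to read "limited capacity" in the abstract.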


