GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms

02/14/2018
by Cédric Colas, et al.
In continuous action domains, standard deep reinforcement learning algorithms such as DDPG explore inefficiently when rewards are sparse or deceptive. Conversely, evolutionary and developmental methods that focus on exploration, such as novelty search, quality-diversity, or goal exploration processes, are less sample-efficient during exploitation. In this paper, we present the GEP-PG approach, which takes the best of both worlds by sequentially combining two variants of a goal exploration process and two variants of DDPG. We study the learning performance of these components and of their combination on a low-dimensional deceptive reward problem and on the larger Half-Cheetah benchmark. Among other things, we show that DDPG fails on the former and that GEP-PG obtains performance above the state of the art on the latter.
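
To make the sequential combination concrete, here is a minimal sketch of a two-phase loop. It is not the authors' implementation. The toy 1-D environment, the linear policy, the function names (`rollout`, `gep_phase`, `pg_phase`) and all hyperparameters are hypothetical. It also assumes that the hand-off between the phases is a shared buffer of transitions, which the abstract does not state. The second phase is only a stand-in that samples minibatches; a real DDPG learner would update an actor and a critic at that point.

```python
import random

def rollout(params, horizon=20):
    """Toy 1-D point environment (hypothetical): a linear policy moves the point."""
    w, b = params
    s, transitions = 0.0, []
    for _ in range(horizon):
        a = max(-1.0, min(1.0, w * s + b))  # bounded continuous action
        s_next = s + 0.1 * a
        r = 0.0                             # reward is ignored during exploration
        transitions.append((s, a, r, s_next))
        s = s_next
    return s, transitions                   # outcome = final position

def gep_phase(n_episodes, n_bootstrap, rng, sigma=0.1):
    """Goal exploration: sample a goal, then perturb the policy whose outcome was closest."""
    archive, buffer = [], []
    for ep in range(n_episodes):
        if ep < n_bootstrap:
            # bootstrap with random policies
            params = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        else:
            goal = rng.uniform(-2, 2)       # random goal in outcome space
            nearest = min(archive, key=lambda e: abs(e[1] - goal))
            params = tuple(p + rng.gauss(0, sigma) for p in nearest[0])
        outcome, transitions = rollout(params)
        archive.append((params, outcome))
        buffer.extend(transitions)
    return archive, buffer

def pg_phase(buffer, n_updates, batch_size, rng):
    """Stand-in for the DDPG phase: only samples minibatches from the seeded buffer."""
    batch = []
    for _ in range(n_updates):
        batch = rng.sample(buffer, batch_size)
        # a real learner would update actor and critic from `batch` here
    return batch

rng = random.Random(0)
archive, buffer = gep_phase(n_episodes=50, n_bootstrap=10, rng=rng)
last_batch = pg_phase(buffer, n_updates=5, batch_size=32, rng=rng)
```

The point of the sketch is the ordering: exploration runs first and never looks at the reward, and the gradient-based learner only starts once the buffer already covers a range of outcomes.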

research
12/07/2018

Off-Policy Deep Reinforcement Learning without Exploration

Reinforcement learning traditionally considers the task of balancing exp...
research
05/02/2022

Exploration in Deep Reinforcement Learning: A Survey

This paper reviews exploration techniques in deep reinforcement learning...
research
03/27/2019

Autoregressive Policies for Continuous Control Deep Reinforcement Learning

Reinforcement learning algorithms rely on exploration to discover new be...
research
11/22/2022

Efficient Exploration using Model-Based Quality-Diversity with Gradients

Exploration is a key challenge in Reinforcement Learning, especially in ...
research
10/21/2021

Anti-Concentrated Confidence Bonuses for Scalable Exploration

Intrinsic rewards play a central role in handling the exploration-exploi...
research
11/24/2022

Assessing Quality-Diversity Neuro-Evolution Algorithms Performance in Hard Exploration Problems

A fascinating aspect of nature lies in its ability to produce a collecti...
research
03/13/2018

Policy Search in Continuous Action Domains: an Overview

Continuous action policy search, the search for efficient policies in co...
