Cell-Free Latent Go-Explore

by Quentin Gallouédec, et al.
Ecole Centrale de Lyon

In this paper, we introduce Latent Go-Explore (LGE), a simple and general approach based on the Go-Explore paradigm for exploration in reinforcement learning (RL). Go-Explore was initially introduced with a strong domain-knowledge constraint for partitioning the state space into cells. However, in most real-world scenarios, extracting domain knowledge from raw observations is complex and tedious. If the cell partitioning is not informative enough, Go-Explore can completely fail to explore the environment. We argue that the Go-Explore approach can be generalized to any environment without domain knowledge and without cells by exploiting a learned latent representation. We show that LGE can be flexibly combined with any strategy for learning a latent representation. We also show that LGE, although simpler than Go-Explore, is more robust and outperforms all state-of-the-art algorithms in terms of pure exploration on multiple hard-exploration environments. The LGE implementation is available as open source at https://github.com/qgallouedec/lge.
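
To make the idea of replacing cells with a learned latent representation more concrete, here is a minimal sketch, not the authors' implementation. It assumes that visited states have already been encoded into latent vectors (random stand-ins are used here instead of a trained encoder) and that goal states are picked with a probability that favors sparsely visited latent regions. The function names, the Gaussian kernel, the bandwidth, and the inverse-density weighting are all illustrative assumptions; the repository linked above is the reference for what LGE actually does.

```python
import numpy as np


def latent_density(latents, bandwidth=1.0):
    """Gaussian kernel density estimate at each archived latent point.

    Hypothetical helper: it stands in for whatever density estimate
    an actual implementation would use over the latent space.
    """
    diffs = latents[:, None, :] - latents[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2)).mean(axis=1)


def sample_goal_index(latents, rng, bandwidth=1.0):
    """Pick the index of an archived state, favoring low-density regions.

    Assumption: weights are inversely proportional to the estimated density.
    """
    density = latent_density(latents, bandwidth)
    weights = 1.0 / density  # density >= 1/n because each point counts itself
    probs = weights / weights.sum()
    return int(rng.choice(len(latents), p=probs))


rng = np.random.default_rng(0)
# Stand-in archive: 50 latent points near the origin plus one far-away point.
cluster = rng.normal(0.0, 0.1, size=(50, 2))
latents = np.vstack([cluster, [[5.0, 5.0]]])

density = latent_density(latents)
goal_index = sample_goal_index(latents, rng)
```

In this toy archive the isolated point has the lowest estimated density, so it is the single most likely goal. In a Go-Explore style loop, the agent would first return to the selected state and then explore from there, with no hand-designed cell partition involved.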



Long-Term Exploration in Persistent MDPs

Exploration is an essential part of reinforcement learning, which restri...

Time-Myopic Go-Explore: Learning A State Representation for the Go-Explore Paradigm

Very large state spaces with a sparse reward signal are difficult to exp...

Go-Explore: a New Approach for Hard-Exploration Problems

A grand challenge in reinforcement learning is intelligent exploration, ...

BYOL-Explore: Exploration by Bootstrapped Prediction

We present BYOL-Explore, a conceptually simple yet general approach for ...

Exploration via Elliptical Episodic Bonuses

In recent years, a number of reinforcement learning (RL) methods have be...

Latent World Models For Intrinsically Motivated Exploration

In this work we consider partially observable environments with sparse r...

Balancing Exploration and Exploitation in Hierarchical Reinforcement Learning via Latent Landmark Graphs

Goal-Conditioned Hierarchical Reinforcement Learning (GCHRL) is a promis...
