BYOL-Explore: Exploration by Bootstrapped Prediction

06/16/2022
by Zhaohan Daniel Guo, et al.

We present BYOL-Explore, a conceptually simple yet general approach for curiosity-driven exploration in visually complex environments. BYOL-Explore learns a world representation, the world dynamics, and an exploration policy all together by optimizing a single prediction loss in the latent space with no additional auxiliary objective. We show that BYOL-Explore is effective in DM-HARD-8, a challenging partially observable continuous-action hard-exploration benchmark with visually rich 3-D environments. On this benchmark, we solve the majority of the tasks purely through augmenting the extrinsic reward with BYOL-Explore's intrinsic reward, whereas prior work could only get off the ground with human demonstrations. As further evidence of the generality of BYOL-Explore, we show that it achieves superhuman performance on the ten hardest exploration games in Atari while having a much simpler design than other competitive agents.
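
To make the idea of a latent-space prediction loss doubling as an intrinsic reward more concrete, here is a minimal NumPy sketch. It is not the authors' implementation. The abstract gives no architecture or hyperparameters, so every name here (`latent_prediction_error`, `ema_update`, `combined_reward`), the unit-normalization of latents, the slowly updated target parameters, and the `decay` and `scale` values are illustrative assumptions.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale vectors along the last axis to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def latent_prediction_error(predicted, target):
    """Squared distance between unit-normalized latents, one value per row.

    Assumption: the same quantity serves as the training loss and as the
    per-step intrinsic reward (larger error = more "surprising" step).
    """
    diff = l2_normalize(predicted) - l2_normalize(target)
    return np.sum(diff ** 2, axis=-1)


def ema_update(target_params, online_params, decay=0.99):
    """Move target parameters slowly toward the online parameters.

    Assumption: the targets come from a slowly moving copy of the online
    network; decay=0.99 is a placeholder, not a value from the paper.
    """
    return decay * target_params + (1.0 - decay) * online_params


def combined_reward(extrinsic, intrinsic, scale=0.1):
    """Augment the extrinsic reward with a scaled intrinsic reward.

    Assumption: a simple weighted sum; scale=0.1 is a placeholder.
    """
    return extrinsic + scale * intrinsic


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    predicted = rng.normal(size=(4, 16))  # hypothetical predicted latents
    target = rng.normal(size=(4, 16))     # hypothetical target latents
    intrinsic = latent_prediction_error(predicted, target)
    extrinsic = np.zeros(4)               # sparse-reward setting: no task reward yet
    print(combined_reward(extrinsic, intrinsic))
```

With unit-normalized latents the error is bounded between 0 (perfect prediction) and 4 (opposite directions), which keeps the intrinsic term on a predictable scale before it is mixed with the extrinsic reward.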

research
09/03/2019

Making Efficient Use of Demonstrations to Solve Hard Exploration Problems

This paper introduces R2D3, an agent that makes efficient use of demonst...
research
06/09/2019

Curiosity-Driven Multi-Criteria Hindsight Experience Replay

Dealing with sparse rewards is a longstanding challenge in reinforcement...
research
11/18/2022

Curiosity in hindsight

Consider the exploration in sparse-reward or reward-free environments, s...
research
10/05/2020

Latent World Models For Intrinsically Motivated Exploration

In this work we consider partially observable environments with sparse r...
research
05/18/2021

Fixed β-VAE Encoding for Curious Exploration in Complex 3D Environments

Curiosity is a general method for augmenting an environment reward with ...
research
08/31/2022

Cell-Free Latent Go-Explore

In this paper, we introduce Latent Go-Explore (LGE), a simple and genera...
research
01/13/2023

Time-Myopic Go-Explore: Learning A State Representation for the Go-Explore Paradigm

Very large state spaces with a sparse reward signal are difficult to exp...
