Latent World Models For Intrinsically Motivated Exploration

10/05/2020
by   Aleksandr Ermolov, et al.
Università di Trento
0

In this work we consider partially observable environments with sparse rewards. We present a self-supervised representation learning method for image-based observations, which arranges embeddings respecting temporal distance of observations. This representation is empirically robust to stochasticity and suitable for novelty detection from the error of a predictive forward model. We consider episodic and life-long uncertainties to guide the exploration. We propose to estimate the missing information about the environment with the world model, which operates in the learned latent space. As a motivation of the method, we analyse the exploration problem in a tabular Partially Observable Labyrinth. We demonstrate the method on image-based hard exploration environments from the Atari benchmark and report significant improvement with respect to prior work. The source code of the method and all the experiments is available at https://github.com/htdt/lwm.

READ FULL TEXT
04/07/2022

Temporal Alignment for History Representation in Reinforcement Learning

Environments in Reinforcement Learning are usually only partially observ...
08/10/2020

GRIMGEP: Learning Progress for Robust Goal Sampling in Visual Deep Reinforcement Learning

Autonomous agents using novelty based goal exploration are often efficie...
06/16/2022

BYOL-Explore: Exploration by Bootstrapped Prediction

We present BYOL-Explore, a conceptually simple yet general approach for ...
10/17/2020

Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning

Efficient exploration remains a challenging problem in reinforcement lea...
10/02/2018

EMI: Exploration with Mutual Information Maximizing State and Action Embeddings

Policy optimization struggles when the reward feedback signal is very sp...
08/31/2022

Cell-Free Latent Go-Explore

In this paper, we introduce Latent Go-Explore (LGE), a simple and genera...
01/17/2023

Syntactically Robust Training on Partially-Observed Data for Open Information Extraction

Open Information Extraction models have shown promising results with suf...

Code Repositories

lwm

Official repository of the paper Latent World Models For Intrinsically Motivated Exploration


view repo

Please sign up or login with your details

Forgot password? Click here to reset