Unlocking the Power of Representations in Long-term Novelty-based Exploration

05/02/2023
by Alaa Saade, et al.

We introduce Robust Exploration via Clustering-based Online Density Estimation (RECODE), a non-parametric method for novelty-based exploration that estimates visitation counts for clusters of states based on their similarity in a chosen embedding space. By adapting classical clustering to the nonstationary setting of deep RL, RECODE can efficiently track state visitation counts over thousands of episodes. We further propose a novel generalization of the inverse dynamics loss that leverages masked transformer architectures for multi-step prediction; in conjunction with RECODE, it achieves a new state of the art on a suite of challenging 3D exploration tasks in DM-Hard-8. RECODE also sets a new state of the art in hard-exploration Atari games and is the first agent to reach the end screen in "Pitfall!".
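To make the idea of cluster-based visitation counts concrete, here is a minimal toy sketch in Python. It is not the RECODE algorithm itself: the class name `ClusterCounter`, the fixed `radius` threshold, the multiplicative `decay`, the lowest-count eviction rule, and the `1/sqrt(count)` bonus are all assumptions chosen for illustration, and the embeddings are plain lists of floats rather than learned representations.

```python
import math


class ClusterCounter:
    """Toy online density estimator: decayed visit counts per cluster.

    Illustrative only; every hyperparameter here is a hypothetical choice.
    """

    def __init__(self, radius=1.0, max_clusters=100, decay=0.999):
        self.radius = radius              # assumed distance threshold for joining a cluster
        self.max_clusters = max_clusters  # assumed fixed memory budget
        self.decay = decay                # assumed forgetting factor for nonstationarity
        self.centers = []
        self.counts = []

    def update(self, embedding):
        """Record one visit and return a novelty bonus 1/sqrt(count)."""
        # Decay old counts so stale regions slowly regain novelty.
        self.counts = [c * self.decay for c in self.counts]

        # Find the nearest existing cluster center.
        best, best_dist = None, math.inf
        for i, center in enumerate(self.centers):
            d = math.dist(center, embedding)
            if d < best_dist:
                best, best_dist = i, d

        if best is not None and best_dist <= self.radius:
            # Close enough: count the visit and nudge the center toward it.
            self.counts[best] += 1.0
            n = self.counts[best]
            self.centers[best] = [
                c + (x - c) / n for c, x in zip(self.centers[best], embedding)
            ]
        else:
            # Novel region: evict the least-visited cluster if memory is full.
            if len(self.centers) >= self.max_clusters:
                evict = min(range(len(self.counts)), key=self.counts.__getitem__)
                del self.centers[evict]
                del self.counts[evict]
            self.centers.append(list(embedding))
            self.counts.append(1.0)
            best = len(self.centers) - 1

        return 1.0 / math.sqrt(self.counts[best])


counter = ClusterCounter(radius=0.5, decay=1.0)
rewards = [counter.update(e) for e in ([0.0, 0.0], [0.1, 0.0], [5.0, 5.0])]
print(rewards)
```

In this sketch, a repeated visit near an existing center earns a smaller bonus than a visit to a far-away point, which is the qualitative behavior a count-based novelty signal is meant to provide.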


