Deep Hierarchical Planning from Pixels

06/08/2022
by   Danijar Hafner, et al.

Intelligent agents need to select long sequences of actions to solve complex tasks. While humans easily break down tasks into subgoals and reach them through millions of muscle commands, current artificial intelligence is limited to tasks with horizons of a few hundred decisions, despite large compute budgets. Research on hierarchical reinforcement learning aims to overcome this limitation but has proven challenging: current methods rely on manually specified goal spaces or subtasks, and no general solution exists. We introduce Director, a practical method for learning hierarchical behaviors directly from pixels by planning inside the latent space of a learned world model. The high-level policy maximizes task and exploration rewards by selecting latent goals, and the low-level policy learns to achieve those goals. Despite operating in latent space, the decisions are interpretable because the world model can decode goals into images for visualization. Director outperforms exploration methods on tasks with sparse rewards, including 3D maze traversal with a quadruped robot from an egocentric camera and proprioception, without access to the global position or top-down view used by prior work. Director also learns successful behaviors across a wide range of environments, including visual control, Atari games, and DMLab levels.
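The hierarchical control flow the abstract describes can be sketched in a few lines. This is a minimal illustration only: the function bodies below are hypothetical stand-ins (Director trains these components inside a learned world model's latent space), but the loop shows the division of labor, with the high-level policy choosing a latent goal every K steps and the low-level policy emitting primitive actions conditioned on that goal.

```python
import numpy as np

rng = np.random.default_rng(0)
GOAL_DIM, ACTION_DIM, K = 8, 3, 4  # K = steps between high-level decisions

def high_level_policy(state):
    # Hypothetical stand-in: selects a latent goal. In Director, goals live in
    # the world model's representation space and the policy is trained to
    # maximize task and exploration rewards.
    return rng.normal(size=GOAL_DIM)

def low_level_policy(state, goal):
    # Hypothetical stand-in: outputs a primitive action. In Director, this
    # policy is rewarded for reaching the latent goal.
    return rng.normal(size=ACTION_DIM)

def env_step(state, action):
    # Placeholder environment transition (the real method imagines
    # transitions inside the learned world model).
    return state + 0.1 * rng.normal(size=state.shape)

state = np.zeros(GOAL_DIM)
goal = None
trajectory = []
for t in range(12):
    if t % K == 0:                      # high level acts every K steps
        goal = high_level_policy(state)
    action = low_level_policy(state, goal)
    state = env_step(state, action)
    trajectory.append(action)

print(len(trajectory))  # 12 primitive actions from 3 high-level decisions
```

The key design choice this sketch highlights is temporal abstraction: the high-level policy makes only horizon/K decisions, which is what lets hierarchical methods scale beyond the few-hundred-step horizons of flat policies.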


