Planning from Pixels in Environments with Combinatorially Hard Search Spaces

10/12/2021
by Marco Bagatella, et al.

The ability to form complex plans based on raw visual input is a litmus test for the current capabilities of artificial intelligence, as it requires a seamless combination of visual processing and abstract algorithmic execution, two traditionally separate areas of computer science. A recent surge of interest in this field has brought advances that yield good performance in tasks ranging from arcade games to continuous control; these methods, however, come with significant issues, such as limited generalization capabilities and difficulties in dealing with combinatorially hard planning instances. Our contribution is two-fold: (i) we present a method that learns to represent its environment as a latent graph and leverages state reidentification to reduce the complexity of finding a good policy from exponential to linear, and (ii) we introduce a set of lightweight environments with an underlying discrete combinatorial structure in which planning is challenging even for humans. Moreover, we show that our method achieves strong empirical generalization to variations in the environment, even in highly disadvantaged regimes such as "one-shot" planning or an offline RL paradigm that provides only low-quality trajectories.
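To make the exponential-versus-linear claim concrete, the following is a minimal sketch of why recognizing already-visited states helps search. It is not the authors' method: it skips the learned latent representation entirely and uses a toy one-dimensional chain whose states are plain integers. The names `chain_neighbors` and `count_expansions` are hypothetical and chosen only for illustration.

```python
from collections import deque

N = 10  # assumed toy environment: positions 0..N-1 on a chain, goal at N-1


def chain_neighbors(state):
    """Two actions, left and right, clamped to the ends of the chain."""
    return [max(state - 1, 0), min(state + 1, N - 1)]


def count_expansions(start, goal, neighbors, reidentify, max_depth):
    """Breadth-first search that counts how many nodes are popped.

    With reidentify=True, a state seen before is never queued again, so the
    search runs over a graph of distinct states. With reidentify=False, every
    action sequence is its own node, so the search runs over a tree.
    """
    frontier = deque([(start, 0)])
    seen = {start}
    expansions = 0
    while frontier:
        state, depth = frontier.popleft()
        expansions += 1
        if state == goal:
            return expansions
        if depth == max_depth:
            continue  # depth cap keeps the tree search finite
        for nxt in neighbors(state):
            if reidentify:
                if nxt in seen:
                    continue
                seen.add(nxt)
            frontier.append((nxt, depth + 1))
    return expansions


graph_cost = count_expansions(0, N - 1, chain_neighbors, True, N - 1)
tree_cost = count_expansions(0, N - 1, chain_neighbors, False, N - 1)
print("expansions with reidentification:", graph_cost)
print("expansions without reidentification:", tree_cost)
```

With reidentification, the number of expansions is bounded by the number of distinct states, while the tree search grows with the number of action sequences. In the paper's setting the states are images, so deciding that two observations denote the same state is itself a learned component; the integer equality used here stands in for that step.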


research
02/04/2019

Obstacle Tower: A Generalization Challenge in Vision, Control, and Planning

The rapid pace of research in Deep Reinforcement Learning has been drive...
research
06/08/2022

Deep Hierarchical Planning from Pixels

Intelligent agents need to select long sequences of actions to solve com...
research
11/19/2019

Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

Constructing agents with planning capabilities has long been one of the ...
research
01/29/2021

Interleaving Graph Search and Trajectory Optimization for Aggressive Quadrotor Flight

Quadrotors can achieve aggressive flight by tracking complex maneuvers a...
research
06/08/2021

Vector Quantized Models for Planning

Recent developments in the field of model-based RL have proven successfu...
research
02/15/2021

Neuro-algorithmic Policies enable Fast Combinatorial Generalization

Although model-based and model-free approaches to learning the control o...
research
10/12/2022

Efficient Offline Policy Optimization with a Learned Model

MuZero Unplugged presents a promising approach for offline policy learni...
