Learning to Navigate Using Mid-Level Visual Priors

12/23/2019
by Alexander Sax et al.

How much does having visual priors about the world (e.g. the fact that the world is 3D) assist in learning to perform downstream motor tasks (e.g. navigating a complex environment)? What are the consequences of not utilizing such visual priors in learning? We study these questions by integrating a generic perceptual skill set (a distance estimator, an edge detector, etc.) within a reinforcement learning framework (see Fig. 1). This skill set ("mid-level vision") provides the policy with a more processed state of the world than raw images. Our large-scale study demonstrates that using mid-level vision yields policies that learn faster, generalize better, and achieve higher final performance than policies learned from scratch or with state-of-the-art visual and non-visual representation learning methods. We show that conventional computer vision objectives are particularly effective in this regard and can be conveniently integrated into reinforcement learning frameworks. Finally, we find that no single visual representation is universally useful for all downstream tasks, so we computationally derive a task-agnostic set of representations optimized to support arbitrary downstream tasks.
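The arrangement described above can be pictured as a frozen perception module feeding a trainable policy. The following is a minimal sketch (PyTorch, not the authors' released code): `MidLevelEncoder` and `Policy` are hypothetical placeholder classes, with the encoder standing in for any pretrained mid-level network (e.g. a depth or edge estimator) that stays frozen while only the policy head is trained on the downstream task.

```python
# Minimal sketch (assumed setup, not the paper's released code): the RL policy
# consumes frozen mid-level visual features instead of raw pixels.

import torch
import torch.nn as nn


class MidLevelEncoder(nn.Module):
    """Placeholder for a pretrained mid-level vision network (frozen during RL)."""

    def __init__(self, feature_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feature_dim),
        )
        for p in self.parameters():          # freeze: only the policy is trained
            p.requires_grad = False

    def forward(self, rgb):                  # rgb: (B, 3, H, W) in [0, 1]
        with torch.no_grad():
            return self.backbone(rgb)


class Policy(nn.Module):
    """Small policy head that sees mid-level features rather than pixels."""

    def __init__(self, feature_dim=128, num_actions=4):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(feature_dim, 64), nn.ReLU(),
            nn.Linear(64, num_actions),
        )

    def forward(self, features):
        return self.head(features)           # action logits


if __name__ == "__main__":
    encoder, policy = MidLevelEncoder(), Policy()
    obs = torch.rand(1, 3, 84, 84)           # dummy RGB observation
    logits = policy(encoder(obs))
    print(logits.shape)                      # torch.Size([1, 4])
```

In practice the placeholder backbone would be replaced by an actual pretrained mid-level checkpoint; the sketch only illustrates the design choice that the policy receives a compact, processed representation of the scene rather than raw images.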
