Visual Pre-training for Navigation: What Can We Learn from Noise?

06/30/2022
by Yanwei Wang, et al.

A powerful paradigm for sensorimotor control is to predict actions directly from observations. Training such an end-to-end system allows representations useful for the downstream task to emerge automatically. In visual navigation, an agent can learn to navigate without any manual design by correlating how its views change with the actions it takes. However, the lack of inductive bias makes this approach data-inefficient and impractical in scenarios like search and rescue, where interacting with the environment to collect data is costly. We hypothesize that a representation of the current view and the goal view sufficient for a navigation policy can be learned by predicting the location and size of the crop of the current view that corresponds to the goal. We further show that training this random-crop prediction task in a self-supervised fashion purely on random noise images transfers well to natural home images. The learned representation can then be bootstrapped to learn a navigation policy efficiently from little interaction data. Code is available at https://github.com/yanweiw/noise2ptz.
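To make the pretext task concrete, below is a minimal sketch in PyTorch of crop prediction trained on noise. The network sizes, crop ranges, label normalization, and the names sample_pair and CropPredictor are illustrative assumptions, not the architecture from the noise2ptz repository; the sketch only captures the shape of the idea: sample a noise image as the "current view", take a random crop as the "goal view", and regress the crop's normalized location and size.

# Illustrative sketch of the crop-prediction pretext task on noise images.
# Hyperparameters and architecture are assumptions, not the authors' exact setup.

import torch
import torch.nn as nn
import torch.nn.functional as F

IMG, CROP = 64, 32  # assumed sizes of the "current view" and resized "goal view"

def sample_pair(batch):
    """Sample noise images and random crops; return views and crop labels."""
    imgs = torch.rand(batch, 3, IMG, IMG)        # random noise "current views"
    sizes = torch.randint(16, IMG + 1, (batch,)) # random crop side lengths
    goals, labels = [], []
    for i in range(batch):
        s = sizes[i].item()
        x = torch.randint(0, IMG - s + 1, (1,)).item()
        y = torch.randint(0, IMG - s + 1, (1,)).item()
        crop = imgs[i : i + 1, :, y : y + s, x : x + s]
        goals.append(F.interpolate(crop, size=(CROP, CROP), mode="bilinear",
                                   align_corners=False))
        labels.append(torch.tensor([x / IMG, y / IMG, s / IMG]))  # normalized
    return imgs, torch.cat(goals), torch.stack(labels)

class CropPredictor(nn.Module):
    """Two small conv encoders; an MLP regresses the crop's (x, y, size)."""
    def __init__(self):
        super().__init__()
        def enc():
            return nn.Sequential(
                nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.cur_enc, self.goal_enc = enc(), enc()
        self.head = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 3))

    def forward(self, cur, goal):
        feats = torch.cat([self.cur_enc(cur), self.goal_enc(goal)], dim=1)
        return self.head(feats)

model = CropPredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(100):                          # toy pre-training loop
    cur, goal, label = sample_pair(32)
    loss = F.mse_loss(model(cur, goal), label)
    opt.zero_grad(); loss.backward(); opt.step()

After pre-training on noise, the two encoders would serve as the perception backbone for the downstream navigation policy, which the abstract suggests can then be trained with little interaction data.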


