Following High-level Navigation Instructions on a Simulated Quadcopter with Imitation Learning

05/31/2018
by Valts Blukis, et al.

We introduce a method for following high-level navigation instructions by mapping directly from images, instructions, and pose estimates to continuous low-level velocity commands for real-time control. The Grounded Semantic Mapping Network (GSMN) is a fully differentiable neural network architecture that builds an explicit semantic map in the world reference frame by incorporating a pinhole camera projection model within the network. The information stored in the map is learned from experience, while the local-to-world transformation is computed explicitly. We train the model using DAggerFM, a modified variant of DAgger that trades tabular convergence guarantees for improved training speed and memory use. We test GSMN in virtual environments on a realistic quadcopter simulator and show that incorporating explicit mapping and grounding modules allows GSMN to outperform strong neural baselines and nearly match the performance of an expert policy. Finally, we analyze the learned map representations and show that using an explicit map leads to an interpretable instruction-following model.
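
To make the explicit local-to-world transformation concrete, the following is a minimal sketch of the underlying pinhole geometry for a single pixel. It is not the GSMN implementation, which applies a differentiable projection inside the network. The function name `pixel_to_ground`, its parameters, the flat ground plane at z = 0, and the camera axis convention (x right, y down, z forward) are all assumptions made for illustration.

```python
def pixel_to_ground(u, v, fx, fy, cx, cy, cam_pos, R_wc):
    """Project pixel (u, v) onto the ground plane z = 0 in the world frame.

    Hypothetical helper, not taken from the paper.
    fx, fy, cx, cy: pinhole intrinsics (focal lengths and principal point).
    cam_pos: camera position (x, y, z) in the world frame, with z > 0.
    R_wc: 3x3 rotation (nested lists) taking camera-frame vectors to world frame.
    Returns world (x, y), or None if the ray never reaches the ground.
    """
    # Ray through the pixel in the camera frame (x right, y down, z forward).
    d_c = ((u - cx) / fx, (v - cy) / fy, 1.0)
    # Rotate the ray direction into the world frame.
    d_w = tuple(sum(R_wc[i][j] * d_c[j] for j in range(3)) for i in range(3))
    # A ray that points level or upward does not intersect the ground plane.
    if d_w[2] >= 0.0:
        return None
    # Solve cam_pos.z + t * d_w.z = 0 for the ray parameter t.
    t = -cam_pos[2] / d_w[2]
    return (cam_pos[0] + t * d_w[0], cam_pos[1] + t * d_w[1])


# Assumed example pose: camera 2 m above the ground, looking straight down.
# Columns of R_DOWN are the camera x, y, z axes expressed in world coordinates.
R_DOWN = [[1, 0, 0],
          [0, -1, 0],
          [0, 0, -1]]

# The principal point maps to the spot directly below the camera.
center = pixel_to_ground(64, 64, 100.0, 100.0, 64.0, 64.0, (1.0, 3.0, 2.0), R_DOWN)
```

In a learned map, the value written at the resulting world location would be a feature vector produced by the network rather than a raw pixel. This sketch covers only the fixed geometric step, which involves no learned parameters.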
