SNAP: Self-Supervised Neural Maps for Visual Positioning and Semantic Understanding

06/08/2023
by Paul-Edouard Sarlin, et al.

Semantic 2D maps are commonly used by humans and machines for navigation purposes, whether it's walking or driving. However, these maps have limitations: they lack detail, often contain inaccuracies, and are difficult to create and maintain, especially in an automated fashion. Can we use raw imagery to automatically create better maps that can be easily interpreted by both humans and machines? We introduce SNAP, a deep network that learns rich neural 2D maps from ground-level and overhead images. We train our model to align neural maps estimated from different inputs, supervised only with camera poses over tens of millions of StreetView images. SNAP can resolve the location of challenging image queries beyond the reach of traditional methods, outperforming the state of the art in localization by a large margin. Moreover, our neural maps encode not only geometry and appearance but also high-level semantics, discovered without explicit supervision. This enables effective pre-training for data-efficient semantic scene understanding, with the potential to unlock cost-efficient creation of more detailed maps.
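
The idea of aligning neural maps with only camera poses as supervision can be illustrated with a toy sketch. This is not the authors' implementation: it assumes that a map and a query are plain feature grids, that the pose is a 2D translation only, and that the names `score_translations` and `pose_nll` are hypothetical helpers. It scores every candidate offset by feature correlation and penalizes the model when the ground-truth offset is not the most likely one.

```python
import numpy as np

def score_translations(map_feats, query_feats):
    """Correlate a small query feature grid with every valid offset of a larger map.

    map_feats:   (H, W, D) array, a stand-in for a neural 2D map.
    query_feats: (h, w, D) array, a stand-in for a map estimated from a query.
    Returns an (H - h + 1, W - w + 1) array of alignment scores.
    """
    H, W, _ = map_feats.shape
    h, w, _ = query_feats.shape
    scores = np.empty((H - h + 1, W - w + 1))
    for i in range(H - h + 1):
        for j in range(W - w + 1):
            window = map_feats[i:i + h, j:j + w]
            scores[i, j] = np.sum(window * query_feats)
    return scores

def pose_nll(scores, gt_offset):
    """Negative log-likelihood of the ground-truth offset under a softmax over offsets."""
    log_z = np.logaddexp.reduce(scores.ravel())  # numerically stable log-sum-exp
    return log_z - scores[gt_offset]

# Toy data: the query is cropped from the map at a known (assumed) offset.
rng = np.random.default_rng(0)
map_feats = rng.normal(size=(16, 16, 8))
gt_offset = (5, 9)
query_feats = map_feats[5:9, 9:13].copy()

scores = score_translations(map_feats, query_feats)
best = tuple(int(x) for x in np.unravel_index(np.argmax(scores), scores.shape))
loss = pose_nll(scores, gt_offset)
print("estimated offset:", best, "loss:", float(loss))
```

In a learned system the feature grids would come from a network and the loss would be backpropagated through it, so the only label needed is the known pose. A real localization setting would also have to search over rotation, which this sketch leaves out.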

research
07/23/2023

Learning Navigational Visual Representations with Semantic Map Supervision

Being able to perceive the semantics and the spatial structure of the en...
research
04/04/2023

OrienterNet: Visual Localization in 2D Public Maps with Neural Matching

Humans can orient themselves in their 3D environments using simple 2D ma...
research
02/08/2023

SkyEye: Self-Supervised Bird's-Eye-View Semantic Mapping Using Monocular Frontal View Images

Bird's-Eye-View (BEV) semantic maps have become an essential component o...
research
07/20/2020

Seeing the Un-Scene: Learning Amodal Semantic Maps for Room Navigation

We introduce a learning-based approach for room navigation using semanti...
research
12/01/2022

A General Purpose Supervisory Signal for Embodied Agents

Training effective embodied AI agents often involves manual reward engin...
research
12/14/2022

ECON: Explicit Clothed humans Optimized via Normal integration

The combination of deep learning, artist-curated scans, and Implicit Fun...
research
05/10/2018

Fighting Fake News: Image Splice Detection via Learned Self-Consistency

Advances in photo editing and manipulation tools have made it significan...