Unsupervised Learning of Depth and Deep Representation for Visual Odometry from Monocular Videos in a Metric Space

08/04/2019
by   Xiaochuan Yin, et al.

For ego-motion estimation, the feature representation of the scene is crucial. Previous work shows that both low-level and semantic feature-based methods can achieve promising results, so incorporating a hierarchical feature representation may combine the benefits of both. From this perspective, we propose a novel direct feature odometry framework, named DFO, for depth estimation and hierarchical feature representation learning from monocular videos. By exploiting metric distance, our framework learns the hierarchical feature representation without supervision. The pose is estimated in a coarse-to-fine manner, from high-level to low-level features on progressively enlarged feature maps. A pixel-level attention mask is self-learned to provide prior information. In contrast to previous methods, ours computes the camera motion with a direct method rather than regressing ego-motion from a pose network; this constrains the scale factor of the translation to be consistent and makes the method compatible with the traditional SLAM pipeline. Experiments on the KITTI dataset demonstrate the effectiveness of our method.
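The direct method the abstract contrasts with pose regression works by geometrically warping one frame's features into another given a depth map and a candidate pose, then minimizing the feature difference. As a rough illustration only (this is a generic NumPy sketch of direct feature alignment, not the authors' DFO implementation; `warp_pixels` and `feature_residual` are hypothetical names), the core warp and feature-metric residual look like:

```python
import numpy as np

def warp_pixels(depth, K, R, t):
    """Project each target-frame pixel into the source frame.

    depth: (H, W) depth map of the target frame
    K:     (3, 3) camera intrinsics
    R, t:  rotation (3, 3) and translation (3,) from target to source
    Returns source-frame pixel coordinates, shape (H, W, 2).
    """
    H, W = depth.shape
    v, u = np.mgrid[0:H, 0:W]
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)
    rays = pix @ np.linalg.inv(K).T        # back-project to normalized rays
    pts = rays * depth[..., None]          # 3-D points in the target camera frame
    pts_src = pts @ R.T + t                # rigid transform into the source frame
    proj = pts_src @ K.T                   # project with the intrinsics
    return proj[..., :2] / proj[..., 2:3]  # perspective divide

def feature_residual(feat_src, feat_tgt, coords):
    """Feature-metric residual with nearest-neighbor sampling of the source map."""
    H, W, _ = feat_tgt.shape
    u = np.clip(np.round(coords[..., 0]).astype(int), 0, W - 1)
    v = np.clip(np.round(coords[..., 1]).astype(int), 0, H - 1)
    return feat_src[v, u] - feat_tgt
```

In a coarse-to-fine scheme as described above, this residual would be minimized over the pose parameters first on low-resolution, high-level feature maps and then refined on larger, lower-level ones; with the identity pose, the warp maps every pixel to itself and the residual between identical feature maps is zero.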

