CubifAE-3D: Monocular Camera Space Cubification on Autonomous Vehicles for Auto-Encoder based 3D Object Detection

by Shubham Shrivastava, et al.

We introduce a method for 3D object detection using a single monocular image. Depth data is used to pre-train an RGB-to-Depth Auto-Encoder (AE). The embedding learnt by this AE is then used to train a 3D Object Detector (3DOD) CNN, which regresses the parameters of 3D object poses from the latent embedding that the AE's encoder generates from the RGB image. We show that the AE can be pre-trained once using paired RGB and depth images from simulation data, after which only the 3DOD network needs to be trained on real data comprising RGB images and 3D object pose labels (with no requirement for dense depth). Our 3DOD network utilizes a particular cubification of 3D space around the camera, where each cuboid is tasked with predicting N object poses, along with their class and confidence values. The AE pre-training and this method of dividing the 3D space around the camera into cuboids give our method its name - CubifAE-3D. We demonstrate results for monocular 3D object detection on the Virtual KITTI 2, KITTI, and nuScenes datasets for Autonomous Vehicle (AV) perception.
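The cubification idea in the abstract can be illustrated with a minimal sketch: camera space is split into a regular grid of cuboids, and each ground-truth object is assigned to the cuboid that contains its center, with at most N objects kept per cuboid. The bounds, grid shape, axis convention, and the names `BOUNDS`, `GRID_SHAPE`, `cuboid_index`, and `assign_objects` are all assumptions made for illustration; the abstract does not state the actual grid dimensions or how targets are encoded.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]

# Hypothetical camera-space bounds in metres (x: right, y: down, z: forward).
BOUNDS = ((-40.0, 40.0), (-5.0, 5.0), (0.0, 100.0))
# Hypothetical number of cuboids along x, y and z.
GRID_SHAPE = (4, 1, 5)


def cuboid_index(center: Point,
                 bounds=BOUNDS,
                 grid_shape=GRID_SHAPE) -> Optional[Tuple[int, ...]]:
    """Return the (ix, iy, iz) cuboid containing `center`, or None if outside."""
    index = []
    for value, (lo, hi), cells in zip(center, bounds, grid_shape):
        if not lo <= value < hi:
            return None  # the object lies outside the cubified region
        size = (hi - lo) / cells
        # Clamp to guard against floating-point round-off at the upper edge.
        index.append(min(int((value - lo) / size), cells - 1))
    return tuple(index)


def assign_objects(centers: Sequence[Point],
                   max_per_cuboid: int = 2) -> Dict[Tuple[int, ...], List[Point]]:
    """Group object centers by cuboid, keeping at most N objects per cuboid."""
    slots: Dict[Tuple[int, ...], List[Point]] = {}
    for center in centers:
        idx = cuboid_index(center)
        if idx is None:
            continue
        bucket = slots.setdefault(idx, [])
        if len(bucket) < max_per_cuboid:
            bucket.append(center)
    return slots


if __name__ == "__main__":
    objects = [(0.0, 0.0, 10.0), (1.0, 0.5, 12.0), (-30.0, 1.0, 75.0)]
    for idx, members in assign_objects(objects).items():
        print(idx, members)
```

In a detector of this kind, each occupied slot would hold the regression targets (pose, class, confidence) for one object, while empty slots would carry zero confidence. That encoding is likewise an assumption here, not a detail taken from the abstract.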




