Weakly-supervised DCNN for RGB-D Object Recognition in Real-World Applications Which Lack Large-scale Annotated Training Data

03/19/2017
by Li Sun, et al.

This paper addresses the problem of RGB-D object recognition in real-world applications, where large amounts of annotated training data are typically unavailable. To overcome this problem, we propose a novel, weakly-supervised learning architecture (DCNN-GPC) which combines parametric models (a pair of Deep Convolutional Neural Networks (DCNN) for the RGB and D modalities) with non-parametric models (Gaussian Process Classification). Our system is initially trained using a small amount of labeled data, and then automatically propagates labels to large-scale unlabeled data. We first run 3D-based objectness detection on RGB-D videos to acquire many unlabeled object proposals, and then employ DCNN-GPC to label them. As a result, our multi-modal DCNN can be trained end-to-end using only a small amount of human annotation. Finally, our 3D-based objectness detection and multi-modal DCNN are integrated into a real-time detection and recognition pipeline. Our approach requires no bounding-box annotations and achieves boundary-aware detection. We also propose a novel way to pretrain a DCNN for the depth modality, by training on virtual depth images projected from CAD models. We pretrain our multi-modal DCNN on public 3D datasets, achieving performance comparable to state-of-the-art methods on the Washington RGB-D Dataset. We then finetune the network by further training on a small amount of annotated data from our novel dataset of industrial objects (nuclear waste simulants). Our weakly supervised approach has proven highly effective in a novel RGB-D object recognition application that lacks human annotations.
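
The label-propagation step described above can be illustrated with a minimal sketch. The abstract does not give the details of the Gaussian Process Classification used in DCNN-GPC, so this example substitutes a simpler stand-in: Gaussian process regression on ±1 labels with an RBF kernel, keeping only pseudo-labels whose posterior mean is far from zero. The 2-D feature vectors stand in for DCNN features, the task is binary rather than multi-class, and every name and parameter here (`propagate_labels`, `noise`, `min_margin`) is hypothetical.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two feature vectors."""
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq / (2.0 * length_scale ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def propagate_labels(X_lab, y_lab, X_unlab, noise=0.1, min_margin=0.5):
    """Pseudo-label unlabeled features from a small labeled set.

    y_lab entries are +1 or -1. Returns (index, label) pairs for the
    unlabeled items whose GP posterior mean has magnitude >= min_margin;
    ambiguous items are left unlabeled.
    """
    n = len(X_lab)
    # Kernel matrix of the labeled set, with noise added on the diagonal.
    K = [[rbf(X_lab[i], X_lab[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, [float(y) for y in y_lab])
    accepted = []
    for idx, x in enumerate(X_unlab):
        mean = sum(a * rbf(x, xl) for a, xl in zip(alpha, X_lab))
        if abs(mean) >= min_margin:
            accepted.append((idx, 1 if mean > 0 else -1))
    return accepted

# Toy data: two well-separated clusters and one ambiguous proposal.
X_lab = [[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [3.1, 2.9]]
y_lab = [1, 1, -1, -1]
X_unlab = [[0.1, 0.0], [3.0, 3.1], [1.5, 1.5]]
pseudo = propagate_labels(X_lab, y_lab, X_unlab)
```

In this toy run, the two proposals close to a labeled cluster receive pseudo-labels, while the proposal midway between the clusters is rejected. In a pipeline of the kind the abstract describes, the accepted pseudo-labels would then serve as additional training data for the network.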


