SampleAhead: Online Classifier-Sampler Communication for Learning from Synthesized Data

04/01/2018
by   Qi Chen, et al.

State-of-the-art techniques in artificial intelligence, in particular deep learning, are mostly data-driven. However, collecting and manually labeling a large-scale dataset is both difficult and expensive. A promising alternative is to introduce synthesized training data, so that the dataset can be significantly enlarged with little human labor. This raises an important problem in active vision: given an infinite data space, how can we effectively sample a finite subset to train a visual classifier? This paper presents an approach for learning from synthesized data effectively. The motivation is straightforward: increase the probability of seeing difficult training data. We introduce a module named SampleAhead that formulates the learning process as an online communication between a classifier and a sampler, which are updated iteratively. In each round, we adjust the sampling distribution according to the classification results, and train the classifier using data sampled from the updated distribution. Experiments are performed by introducing synthesized images rendered from ShapeNet models to assist PASCAL3D+ classification. Our approach achieves higher classification accuracy, especially when the number of training samples is limited. This demonstrates its efficiency in exploring the infinite data space.
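The round structure described in the abstract can be illustrated with a small sketch. This is not the paper's SampleAhead module: the abstract does not give the update rule, so everything below is an assumption made for illustration. The data space is assumed to be discretized into a few bins (for example, rendering viewpoints), the update blends the current distribution with normalized per-bin error rates, and the "classifier" is a simulated stand-in whose error in a bin shrinks as it sees more samples from that bin. The names `update_distribution`, `sample_bins`, `run_rounds`, `hardness`, `mix`, and `floor` are all hypothetical.

```python
import random


def update_distribution(probs, error_rates, mix=0.5, floor=1e-3):
    """Shift sampling mass toward bins where the classifier errs more.

    Assumed rule (not from the paper): blend the current distribution
    with the normalized error rates, keep a small floor so that no bin
    is starved, then renormalize.
    """
    total_err = sum(error_rates)
    if total_err == 0:
        target = [1.0 / len(probs)] * len(probs)
    else:
        target = [e / total_err for e in error_rates]
    blended = [max(floor, (1 - mix) * p + mix * t)
               for p, t in zip(probs, target)]
    norm = sum(blended)
    return [b / norm for b in blended]


def sample_bins(probs, n, rng):
    """Draw n bin indices from the categorical distribution `probs`."""
    return rng.choices(range(len(probs)), weights=probs, k=n)


def run_rounds(hardness, rounds=5, batch=200, seed=0):
    """Alternate sampler updates and (simulated) classifier training.

    `hardness[b]` is a made-up initial error rate for bin b. The
    simulated classifier's error in a bin decays with the number of
    samples it has seen from that bin; a real system would train and
    evaluate an actual model here.
    """
    rng = random.Random(seed)
    k = len(hardness)
    probs = [1.0 / k] * k
    seen = [0] * k
    for _ in range(rounds):
        # 1. Classification results: per-bin error of the current classifier.
        errors = [h / (1.0 + s / 50.0) for h, s in zip(hardness, seen)]
        # 2. Sampler update: move probability mass toward difficult bins.
        probs = update_distribution(probs, errors)
        # 3. Classifier update: "train" on data drawn from the new distribution.
        for b in sample_bins(probs, batch, rng):
            seen[b] += 1
    return probs, seen


if __name__ == "__main__":
    probs, seen = run_rounds([0.9, 0.1, 0.1])
    print("sampling distribution:", [round(p, 3) for p in probs])
    print("samples drawn per bin:", seen)
```

In this toy setup the bin with the highest initial error ends up with the largest share of sampled data, which is the qualitative behavior the abstract describes. The blend weight and the floor are arbitrary choices here; they trade off how aggressively the sampler chases difficult regions against how much coverage it keeps elsewhere.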


research
03/26/2021

Synthesize-It-Classifier: Learning a Generative Classifier through Recurrent Self-analysis

In this work, we show the generative capability of an image classifier n...
research
06/24/2020

Minimum Cost Active Labeling

Labeling a data set completely is important for groundtruth generation. ...
research
12/12/2015

Active Sampler: Light-weight Accelerator for Complex Data Analytics at Scale

Recent years have witnessed amazing outcomes from "Big Models" trained b...
research
02/24/2023

FedDBL: Communication and Data Efficient Federated Deep-Broad Learning for Histopathological Tissue Classification

Histopathological tissue classification is a fundamental task in computa...
research
08/21/2023

Overcoming Overconfidence for Active Learning

It is not an exaggeration to say that the recent progress in artificial ...
research
11/09/2017

Material Classification in the Wild: Do Synthesized Training Data Generalise Better than Real-World Training Data?

We question the dominant role of real-world training images in the field...
research
06/01/2023

Efficient Failure Pattern Identification of Predictive Algorithms

Given a (machine learning) classifier and a collection of unlabeled data...
