Order-Free RNN with Visual Attention for Multi-Label Classification

07/18/2017
by Shang-Fu Chen, et al.

In this paper, we propose jointly learning attention and recurrent neural network (RNN) models for multi-label classification. While approaches based on either model exist (e.g., for the task of image captioning), training such network architectures typically requires pre-defined label sequences. For multi-label classification, a robust inference process is desirable, so that prediction errors do not propagate and degrade performance. Our proposed model uniquely integrates attention and Long Short-Term Memory (LSTM) models, which not only addresses the above problem but also allows one to identify visual objects of interest with varying sizes without prior knowledge of a particular label ordering. More importantly, label co-occurrence information can be jointly exploited by our LSTM model. Finally, by advancing the technique of beam search, prediction of multiple labels can be efficiently achieved by our proposed network model.
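
To make the beam-search step mentioned in the abstract concrete, the following is a minimal sketch of generic beam search over label sets. It is not the authors' implementation: the function `beam_search_labels`, the callback `step_log_probs`, and the `toy_model` scorer are hypothetical names introduced here for illustration, and the scorer merely stands in for an attention-plus-LSTM model. The sketch assumes the model emits one label per step, never repeats a label, and can emit an end token to stop.

```python
import math
from typing import Callable, List, Sequence, Tuple


def beam_search_labels(
    step_log_probs: Callable[[Tuple[int, ...]], Sequence[float]],
    num_labels: int,
    beam_width: int = 3,
    max_steps: int = 5,
) -> List[int]:
    """Return the highest-scoring label set found by beam search.

    step_log_probs(prefix) is a stand-in for the recurrent model
    (an assumption of this sketch): given the labels emitted so far,
    it returns num_labels + 1 log-probabilities, where the last
    entry scores the end token.
    """
    beams = [((), 0.0)]  # (labels emitted so far, cumulative log-prob)
    finished = []
    for _ in range(max_steps):
        candidates = []
        for prefix, score in beams:
            log_probs = step_log_probs(prefix)
            # Option 1: stop here by emitting the end token.
            finished.append((prefix, score + log_probs[num_labels]))
            # Option 2: extend with any label not yet predicted.
            for label in range(num_labels):
                if label not in prefix:
                    candidates.append((prefix + (label,), score + log_probs[label]))
        if not candidates:
            beams = []
            break
        # Keep only the beam_width best partial hypotheses.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    # Hypotheses still open after max_steps are closed with the end token.
    finished.extend((p, s + step_log_probs(p)[num_labels]) for p, s in beams)
    best_prefix, _ = max(finished, key=lambda c: c[1])
    return sorted(best_prefix)  # the output is a set, so order is irrelevant


def toy_model(prefix: Tuple[int, ...]) -> List[float]:
    """Hypothetical scorer: fixed label weights, renormalised each step."""
    weights = [5.0, 4.0, 0.1, 1.0]  # three labels, then the end token
    masked = [0.0 if i in prefix else w for i, w in enumerate(weights)]
    total = sum(masked)
    return [math.log(w / total) if w > 0 else float("-inf") for w in masked]


predicted = beam_search_labels(toy_model, num_labels=3, beam_width=2)
print(predicted)
```

Because the returned labels are sorted, two hypotheses that reach the same labels in different orders yield the same prediction, which is the sense in which the output of this sketch does not depend on a label ordering.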

research
11/22/2019

Orderless Recurrent Models for Multi-label Classification

Recurrent neural networks (RNN) are popular for many computer vision tas...
research
11/16/2016

Semantic Regularisation for Recurrent Image Annotation

The "CNN-RNN" design pattern is increasingly widely applied in a variety...
research
04/11/2019

Adapting RNN Sequence Prediction Model to Multi-label Set Prediction

We present an adaptation of RNN sequence models to the problem of multi-...
research
01/01/2020

Residual Block-based Multi-Label Classification and Localization Network with Integral Regression for Vertebrae Labeling

Accurate identification and localization of the vertebrae in CT scans is...
research
02/18/2018

Structured Label Inference for Visual Understanding

Visual data such as images and videos contain a rich source of structure...
research
11/14/2017

Saliency-based Sequential Image Attention with Multiset Prediction

Humans process visual scenes selectively and sequentially using attentio...
research
09/08/2019

Order-free Learning Alleviating Exposure Bias in Multi-label Classification

Multi-label classification (MLC) assigns multiple labels to each sample....
