VERAM: View-Enhanced Recurrent Attention Model for 3D Shape Classification

08/20/2018
by Songle Chen, et al.

Multi-view deep neural networks are perhaps the most successful approach to 3D shape classification. However, fusing multi-view features by max or average pooling provides no view selection mechanism, which limits its use in applications such as multi-view active object recognition by a robot. This paper presents VERAM, a recurrent attention model capable of actively selecting a sequence of views for highly accurate 3D shape classification. VERAM addresses an important issue commonly found in existing attention-based models: the unbalanced training of the subnetworks for next-view estimation and shape classification. The classification subnetwork overfits easily while the view estimation subnetwork is usually poorly trained, leading to suboptimal classification performance. VERAM overcomes this with three essential view-enhancement strategies: 1) enhancing the information flow of gradient backpropagation for the view estimation subnetwork, 2) devising a highly informative reward function for the reinforcement training of view estimation, and 3) formulating a novel loss function that explicitly circumvents view duplication. Taking grayscale images as input and AlexNet as the CNN architecture, VERAM with 9 views achieves instance-level and class-level accuracy of 95.5% and 95.3%, state-of-the-art performance under the same number of views.
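To make the pooling baseline mentioned in the abstract concrete, here is a minimal sketch of element-wise max and average fusion of per-view feature vectors. It is not the authors' code: the function name `pool_views` and the feature values are hypothetical, chosen only for illustration. The sketch shows why such fusion has no notion of view selection: every view contributes, and the result does not depend on the order in which views arrive.

```python
from typing import List, Sequence


def pool_views(view_features: Sequence[Sequence[float]], mode: str = "max") -> List[float]:
    """Fuse per-view feature vectors into one shape descriptor, element-wise.

    Hypothetical helper for illustration; not taken from the VERAM paper.
    """
    if not view_features:
        raise ValueError("need at least one view")
    dim = len(view_features[0])
    if any(len(v) != dim for v in view_features):
        raise ValueError("all views must have the same feature dimension")

    # Transpose: one tuple per feature dimension, holding that value from every view.
    columns = list(zip(*view_features))
    if mode == "max":
        return [max(col) for col in columns]
    if mode == "average":
        return [sum(col) / len(col) for col in columns]
    raise ValueError(f"unknown mode: {mode}")


# Made-up features for 3 views of one shape, 4 dimensions each.
views = [
    [0.1, 0.9, 0.0, 0.4],
    [0.7, 0.2, 0.3, 0.4],
    [0.5, 0.5, 0.6, 0.1],
]
max_desc = pool_views(views, "max")
avg_desc = pool_views(views, "average")

# Presenting the same views in a different order gives the same max-pooled
# descriptor: the fusion is order-invariant and selects nothing.
shuffled = [views[2], views[0], views[1]]
shuffled_max = pool_views(shuffled, "max")
```

Because the fused descriptor is a fixed function of all available views, an agent that must decide which view to acquire next, as in active object recognition, gets no guidance from it. That gap is what a recurrent attention model with a view estimation subnetwork is meant to fill.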
