LP-3DCNN: Unveiling Local Phase in 3D Convolutional Neural Networks

04/06/2019
by Sudhakar Kumawat, et al.

Traditional 3D Convolutional Neural Networks (CNNs) are computationally expensive, memory intensive, and prone to overfitting; most importantly, their feature learning capabilities need improvement. To address these issues, we propose the Rectified Local Phase Volume (ReLPV) block, an efficient alternative to the standard 3D convolutional layer. The ReLPV block extracts the phase in a 3D local neighborhood (e.g., 3x3x3) of each position of the input map to obtain the feature maps. The phase is extracted by computing the 3D Short-Term Fourier Transform (STFT) at multiple fixed low-frequency points in the 3D local neighborhood of each position. The feature maps at the different frequency points are passed through an activation function and then linearly combined. The ReLPV block provides significant parameter savings, at least 3^3 to 13^3 times compared to the standard 3D convolutional layer with filter sizes 3x3x3 to 13x13x13, respectively. We show that the feature learning capabilities of the ReLPV block are significantly better than those of the standard 3D convolutional layer. Furthermore, it produces consistently better results across different 3D data representations. We achieve state-of-the-art accuracy on the volumetric ModelNet10 and ModelNet40 datasets while utilizing only 11% of the parameters of the current state-of-the-art. We also improve the state-of-the-art on the UCF-101 split-1 action recognition dataset by 5.68% using only 15% of the parameters of the state-of-the-art. The project webpage is available at https://sites.google.com/view/lp-3dcnn/home.
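The pipeline described in the abstract (fixed 3D STFT filters over a local window, an activation, then a learned linear combination) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names `stft_filters` and `relpv_sketch`, the single-channel input, the choice of 13 frequency points taken from the nonzero points of {-1/n, 0, 1/n}^3 (one from each conjugate pair), the use of ReLU, and the use of real and imaginary parts as the phase-carrying feature maps are all assumptions made for illustration.

```python
import itertools
import numpy as np


def stft_filters(n=3, num_freq=13):
    """Fixed complex filters exp(-j*2*pi*<v, y>) over an n x n x n window.

    Assumption: the low-frequency points v are the nonzero points of
    {-1/n, 0, 1/n}^3, keeping one point from each +/- pair (for real input
    the other point only yields the complex conjugate response).
    """
    r = n // 2
    coords = np.arange(-r, r + 1)
    # Window offsets y, shape (n, n, n, 3).
    grid = np.stack(np.meshgrid(coords, coords, coords, indexing="ij"), axis=-1)
    # Lexicographic comparison keeps 13 of the 26 nonzero sign patterns.
    cands = [v for v in itertools.product((-1, 0, 1), repeat=3) if v > (0, 0, 0)]
    freqs = np.array(cands[:num_freq], dtype=float) / n  # (K, 3)
    # Inner product <v, y> for every frequency and offset: (K, n, n, n).
    phase = -2j * np.pi * np.tensordot(freqs, grid, axes=([1], [3]))
    return np.exp(phase)


def relpv_sketch(volume, weights, n=3):
    """Hypothetical ReLPV-style block on a single-channel volume.

    volume:  real array of shape (D, H, W)
    weights: learnable array of shape (F, 2K); the only trainable part here
    returns: array of shape (F, D, H, W)
    """
    filters = stft_filters(n, weights.shape[1] // 2)
    padded = np.pad(volume, n // 2)  # zero padding keeps the output size
    # All n x n x n neighborhoods: (D, H, W, n, n, n).
    windows = np.lib.stride_tricks.sliding_window_view(padded, (n, n, n))
    # Local STFT response per frequency point and position: (K, D, H, W).
    resp = np.einsum("dhwxyz,kxyz->kdhw", windows, filters)
    # Real and imaginary parts as separate feature maps: (2K, D, H, W).
    feats = np.concatenate([resp.real, resp.imag], axis=0)
    feats = np.maximum(feats, 0.0)  # ReLU activation (assumed)
    # Linear combination of the activated maps: (F, D, H, W).
    return np.tensordot(weights, feats, axes=([1], [0]))


rng = np.random.default_rng(0)
out = relpv_sketch(rng.standard_normal((8, 8, 8)), rng.standard_normal((4, 26)))
print(out.shape)  # (4, 8, 8, 8)
```

In this sketch the STFT filters are fixed, so the only trainable values are the entries of `weights`; that is the intuition behind the parameter savings claimed above, although the exact counts in the paper depend on details not given in the abstract.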


