Recognition of Visually Perceived Compositional Human Actions by Multiple Spatio-Temporal Scales Recurrent Neural Networks

by   Haanvid Lee, et al.

The current paper proposes a novel neural network model for recognizing visually perceived human actions. The proposed multiple spatio-temporal scales recurrent neural network (MSTRNN) model is derived by introducing multiple timescale recurrent dynamics to the conventional convolutional neural network model. One of the essential characteristics of the MSTRNN is that its architecture imposes both spatial and temporal constraints simultaneously on the neural activity which vary in multiple scales among different layers. As suggested by the principle of the upward and downward causation, it is assumed that the network can develop meaningful structures such as functional hierarchy by taking advantage of such constraints during the course of learning. To evaluate the characteristics of the model, the current study uses three types of human action video dataset consisting of different types of primitive actions and different levels of compositionality on them. The performance of the MSTRNN in testing with these dataset is compared with the ones by other representative deep learning models used in the field. The analysis of the internal representation obtained through the learning with the dataset clarifies what sorts of functional hierarchy can be developed by extracting the essential compositionality underlying the dataset.


page 1

page 3

page 6

page 10


Predictive Coding for Dynamic Vision : Development of Functional Hierarchy in a Multiple Spatio-Temporal Scales RNN Model

The current paper presents a novel recurrent neural network model, the p...

Predictive Coding for Dynamic Visual Processing: Development of Functional Hierarchy in a Multiple Spatio-Temporal Scales RNN Model

The current paper proposes a novel predictive coding type neural network...

Predicting resonant properties of plasmonic structures by deep learning

Deep learning can be used to extract meaningful results from images. In ...

Structural Recurrent Neural Network (SRNN) for Group Activity Analysis

A group of persons can be analyzed at various semantic levels such as in...

Ensemble perspective for understanding temporal credit assignment

Recurrent neural networks are widely used for modeling spatio-temporal s...

Epileptic Seizure Classification with Symmetric and Hybrid Bilinear Models

Epilepsy affects nearly 1 be treated by anti-epileptic drugs and a much ...

Please sign up or login with your details

Forgot password? Click here to reset