SA-CNN: Dynamic Scene Classification using Convolutional Neural Networks

by   Aalok Gangopadhyay, et al.

The task of classifying videos of natural dynamic scenes into appropriate classes has gained lot of attention in recent years. The problem especially becomes challenging when the camera used to capture the video is dynamic. In this paper, we analyse the performance of statistical aggregation (SA) techniques on various pre-trained convolutional neural network(CNN) models to address this problem. The proposed approach works by extracting CNN activation features for a number of frames in a video and then uses an aggregation scheme in order to obtain a robust feature descriptor for the video. We show through results that the proposed approach performs better than the-state-of-the arts for the Maryland and YUPenn dataset. The final descriptor obtained is powerful enough to distinguish among dynamic scenes and is even capable of addressing the scenario where the camera motion is dominant and the scene dynamics are complex. Further, this paper shows an extensive study on the performance of various aggregation methods and their combinations. We compare the proposed approach with other dynamic scene classification algorithms on two publicly available datasets - Maryland and YUPenn to demonstrate the superior performance of the proposed approach.


page 3

page 7

page 15

page 20


Dynamic texture and scene classification by transferring deep image features

Dynamic texture and scene classification are two fundamental problems in...

Camera Relocalization by Computing Pairwise Relative Poses Using Convolutional Neural Network

We propose a new deep learning based approach for camera relocalization....

Segmentation and Recovery of Superquadric Models using Convolutional Neural Networks

In this paper we address the problem of representing 3D visual data with...

Acoustic Scene Classification Using Fusion of Attentive Convolutional Neural Networks for DCASE2019 Challenge

In this report, the Brno University of Technology (BUT) team submissions...

Accurate and efficient video de-fencing using convolutional neural networks and temporal information

De-fencing is to eliminate the captured fence on an image or a video, pr...

Turning an Urban Scene Video into a Cinemagraph

This paper proposes an algorithm that turns a regular video capturing ur...

Selfie Detection by Synergy-Constraint Based Convolutional Neural Network

Categorisation of huge amount of data on the multimedia platform is a cr...

Please sign up or login with your details

Forgot password? Click here to reset