Deep Architectures and Ensembles for Semantic Video Classification

07/03/2018
by Eng-Jon Ong, et al.

This work addresses the problem of accurate semantic labelling of short videos. We propose a new residual architecture that achieves state-of-the-art classification performance at significantly reduced complexity. Further, we propose four new approaches to diversity-driven multi-net ensembling, one based on a fast correlation measure and three incorporating a DNN-based combiner. We show that significant performance gains can be achieved by "clever" ensembling of diverse nets, and we investigate the factors that contribute to high diversity. Using the extensive YouTube8M dataset, we perform a detailed evaluation of a broad range of deep architectures, including designs based on recurrent networks (RNN), feature space aggregation (FV, VLAD, BoW), simple statistical aggregation, mid-stage AV fusion and others, presenting for the first time an in-depth evaluation and analysis of their behaviour.
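The abstract does not say which correlation measure the ensembling uses or how members are selected. As a rough illustration of the general idea only, and not the authors' method, the sketch below assumes plain Pearson correlation between the flattened prediction scores of each net, a greedy selection rule, and simple score averaging. The function names `pairwise_correlation`, `select_diverse` and `ensemble_average`, and the synthetic scores, are all hypothetical.

```python
import numpy as np

def pairwise_correlation(preds):
    """Pearson correlation between models.

    preds has shape (n_models, n_samples, n_classes); each model's
    scores are flattened into one vector before correlating.
    """
    flat = preds.reshape(preds.shape[0], -1)
    return np.corrcoef(flat)

def select_diverse(preds, k, start=0):
    """Greedily pick k models (assumed rule, not from the paper).

    Starting from model `start`, repeatedly add the model whose mean
    correlation with the already chosen models is lowest.
    """
    corr = pairwise_correlation(preds)
    chosen = [start]
    while len(chosen) < k:
        remaining = [i for i in range(len(preds)) if i not in chosen]
        best = min(remaining, key=lambda i: corr[i, chosen].mean())
        chosen.append(best)
    return chosen

def ensemble_average(preds, members):
    """Combine the selected models by averaging their scores."""
    return preds[members].mean(axis=0)

# Synthetic scores standing in for per-video class predictions.
rng = np.random.default_rng(0)
base = rng.random((50, 4))
preds = np.stack([
    base,                                           # model 0
    base + 0.01 * rng.standard_normal(base.shape),  # near-duplicate of model 0
    rng.random((50, 4)),                            # independent model
])

members = select_diverse(preds, k=2)
combined = ensemble_average(preds, members)
```

In this toy setup the near-duplicate net adds little, so the greedy rule prefers the independent one. A learned combiner, such as the DNN-based combiners the abstract mentions, would replace the plain average in `ensemble_average`.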
