TASED-Net: Temporally-Aggregating Spatial Encoder-Decoder Network for Video Saliency Detection

08/15/2019
by Kyle Min, et al.

TASED-Net is a 3D fully-convolutional network architecture for video saliency detection. It consists of two building blocks: first, an encoder network extracts low-resolution spatiotemporal features from an input clip of several consecutive frames; then a prediction network decodes the encoded features spatially while aggregating all the temporal information. As a result, a single prediction map is produced from an input clip of multiple frames. Frame-wise saliency maps can be predicted by applying TASED-Net to a video in a sliding-window fashion. The proposed approach assumes that the saliency map of any frame can be predicted by considering a limited number of past frames. The results of our extensive experiments on video saliency detection validate this assumption and demonstrate that our fully-convolutional model with temporal aggregation is effective. TASED-Net significantly outperforms previous state-of-the-art approaches on all three major large-scale video saliency datasets: DHF1K, Hollywood2, and UCFSports. After analyzing the results qualitatively, we observe that our model is particularly good at attending to salient moving objects.
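
The sliding-window inference described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `sliding_window_saliency`, `predict_clip`, `clip_len`, and `mean_of_clip` are hypothetical names, the network is replaced by a toy callable that averages scalar "frames", and repeating the first frame to pad early clips is an assumption made only so that every frame receives a prediction.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")
SaliencyMap = TypeVar("SaliencyMap")


def sliding_window_saliency(
    frames: Sequence[Frame],
    predict_clip: Callable[[List[Frame]], SaliencyMap],
    clip_len: int,
) -> List[SaliencyMap]:
    """Return one saliency map per frame, each predicted from a clip
    that ends at that frame and contains `clip_len` frames."""
    if clip_len < 1:
        raise ValueError("clip_len must be at least 1")
    maps = []
    for t in range(len(frames)):
        start = t - clip_len + 1
        # Assumption (not from the abstract): frames with too little
        # history are padded by repeating the first frame.
        clip = [frames[max(i, 0)] for i in range(start, t + 1)]
        # The clip of several frames is reduced to a single map for frame t.
        maps.append(predict_clip(clip))
    return maps


def mean_of_clip(clip):
    # Hypothetical stand-in for the encoder and prediction network:
    # it simply averages scalar "frames" to show temporal aggregation.
    return sum(clip) / len(clip)


video = [0.0, 1.0, 2.0, 3.0, 4.0]  # toy video with scalar frames
saliency = sliding_window_saliency(video, mean_of_clip, clip_len=3)
```

Each output depends only on the current frame and the `clip_len - 1` frames before it, which mirrors the stated assumption that a limited number of past frames is enough to predict the saliency map of any frame.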

research
05/10/2021

Temporal-Spatial Feature Pyramid for Video Saliency Detection

In this paper, we propose a 3D fully convolutional encoder-decoder archi...
research
01/11/2023

VS-Net: Multiscale Spatiotemporal Features for Lightweight Video Salient Document Detection

Video Salient Document Detection (VSDD) is an essential task of practica...
research
09/21/2018

SG-FCN: A Motion and Memory-Based Deep Learning Model for Video Saliency Detection

Data-driven saliency detection has attracted strong interest as a result...
research
05/20/2019

Are all the frames equally important?

In this work, we address the problem of measuring and predicting tempora...
research
08/04/2017

Video Salient Object Detection Using Spatiotemporal Deep Features

This paper presents a method for detecting salient objects in videos whe...
research
08/04/2017

Region-Based Multiscale Spatiotemporal Saliency for Video

Detecting salient objects from a video requires exploiting both spatial ...
research
07/03/2019

Temporal recurrences for video saliency prediction

This paper investigates modifying an existing neural network architectur...
