Tripping through time: Efficient Localization of Activities in Videos

04/22/2019
by Meera Hahn, et al.

Localizing moments in untrimmed videos via language queries is a new and interesting task that requires the ability to accurately ground language into video. Previous works have approached this task by processing the entire video, often more than once, to localize relevant activities. In the real-world applications that this task lends itself to, such as surveillance, efficiency is a pivotal trait of a system. In this paper, we present TripNet, an end-to-end system that uses a gated attention architecture to model fine-grained textual and visual representations in order to align text and video content. Furthermore, TripNet uses reinforcement learning to efficiently localize relevant activity clips in long videos by learning how to intelligently skip around the video. It extracts visual features for fewer frames to perform activity classification. In our evaluation over Charades-STA, ActivityNet Captions and the TACoS dataset, we find that TripNet achieves high accuracy and saves processing time by only looking at 32-41


