Temporal Tessellation: A Unified Approach for Video Analysis

by   Dotan Kaufman, et al.

We present a general approach to video understanding, inspired by semantic transfer techniques that have been successfully used for 2D image analysis. Our method considers a video to be a 1D sequence of clips, each one associated with its own semantics. The nature of these semantics -- natural language captions or other labels -- depends on the task at hand. A test video is processed by forming correspondences between its clips and the clips of reference videos with known semantics, following which, reference semantics can be transferred to the test video. We describe two matching methods, both designed to ensure that (a) reference clips appear similar to test clips and (b), taken together, the semantics of the selected reference clips is consistent and maintains temporal coherence. We use our method for video captioning on the LSMDC'16 benchmark, video summarization on the SumMe and TVSum benchmarks, Temporal Action Detection on the Thumos2014 benchmark, and sound prediction on the Greatest Hits benchmark. Our method not only surpasses the state of the art, in four out of five benchmarks, but importantly, it is the only single method we know of that was successfully applied to such a diverse range of tasks.


page 1

page 5

page 6

page 7

page 8


A Video Summarization Method Using Temporal Interest Detection and Key Frame Prediction

In this paper, a Video Summarization Method using Temporal Interest Dete...

Zero-Shot Dense Video Captioning by Jointly Optimizing Text and Moment

Dense video captioning, a task of localizing meaningful moments and gene...

Perception Test: A Diagnostic Benchmark for Multimodal Video Models

We propose a novel multimodal video benchmark - the Perception Test - to...

Temporal Consistent Automatic Video Colorization via Semantic Correspondence

Video colorization task has recently attracted wide attention. Recent me...

VideoMCC: a New Benchmark for Video Comprehension

While there is overall agreement that future technology for organizing, ...

Neural Task Graphs: Generalizing to Unseen Tasks from a Single Video Demonstration

Our goal is for a robot to execute a previously unseen task based on a s...

FraCaS: Temporal Analysis

In this paper, we propose an implementation of temporal semantics which ...

Please sign up or login with your details

Forgot password? Click here to reset