Towards Scalable Neural Representation for Diverse Videos

by Bo He, et al.

Implicit neural representations (INR) have gained increasing attention in representing 3D scenes and images, and have recently been applied to encode videos (e.g., NeRV, E-NeRV). While achieving promising results, existing INR-based methods are limited to encoding a handful of short videos (e.g., seven 5-second videos in the UVG dataset) with redundant visual content, leading to a model design that fits individual video frames independently and does not scale efficiently to a large number of diverse videos. This paper focuses on developing neural representations for a more practical setup: encoding long videos and/or a large number of videos with diverse visual content. We first show that, instead of dividing videos into small subsets and encoding them with separate models, encoding long and diverse videos jointly with a unified model achieves better compression results. Based on this observation, we propose D-NeRV, a novel neural representation framework designed to encode diverse videos by (i) decoupling clip-specific visual content from motion information, (ii) introducing temporal reasoning into the implicit neural network, and (iii) employing task-oriented flow as an intermediate output to reduce spatial redundancies. Our new model largely surpasses NeRV and traditional video compression techniques on the UCF101 and UVG datasets on the video compression task. Moreover, when used as an efficient data-loader, D-NeRV achieves 3%-10% higher accuracy than NeRV on action recognition tasks on the UCF101 dataset under the same compression ratios.
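
To make concrete what "fitting individual video frames independently" means, here is a toy sketch of a frame-index-based implicit video representation in the spirit of NeRV: a normalized frame index goes in and a full RGB frame comes out, so the network weights themselves are the stored video. This is not the D-NeRV architecture. The class `TinyVideoINR`, the function `positional_encoding`, the layer sizes, and the frequency settings are all hypothetical choices for illustration, and the weights are random rather than trained.

```python
import numpy as np


def positional_encoding(t, num_freqs=8, base=1.25):
    """Map a normalized frame index t in [0, 1] to sin/cos features.

    num_freqs and base are illustrative values, not taken from the paper.
    """
    freqs = (base ** np.arange(num_freqs)) * np.pi
    return np.concatenate([np.sin(freqs * t), np.cos(freqs * t)])


class TinyVideoINR:
    """Toy two-layer MLP: frame index -> RGB frame (hypothetical, untrained)."""

    def __init__(self, height, width, num_freqs=8, hidden=64, seed=0):
        rng = np.random.default_rng(seed)
        self.height, self.width, self.num_freqs = height, width, num_freqs
        in_dim = 2 * num_freqs          # sin and cos per frequency
        out_dim = height * width * 3    # one value per pixel and channel
        self.w1 = rng.normal(0.0, 0.1, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.1, (hidden, out_dim))
        self.b2 = np.zeros(out_dim)

    def num_params(self):
        # The parameter count is what determines the compressed size.
        return sum(p.size for p in (self.w1, self.b1, self.w2, self.b2))

    def decode(self, t):
        """Decode the frame at normalized time t; no other frame is consulted."""
        feats = positional_encoding(t, self.num_freqs)
        hidden = np.maximum(feats @ self.w1 + self.b1, 0.0)        # ReLU
        rgb = 1.0 / (1.0 + np.exp(-(hidden @ self.w2 + self.b2)))  # sigmoid to [0, 1]
        return rgb.reshape(self.height, self.width, 3)


model = TinyVideoINR(height=4, width=6)
frame = model.decode(0.5)  # frame at the middle of the clip
```

Each call to `decode` depends only on the frame index, which is the per-frame independence the abstract identifies as a limitation. D-NeRV's clip-specific content, temporal reasoning, and flow-based intermediate output are the paper's response to that limitation and are deliberately not modeled in this sketch.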




