A Reinforcement-Learning-Based Energy-Efficient Framework for Multi-Task Video Analytics Pipeline

by Yingying Zhao, et al.

Deep-learning-based video processing has yielded transformative results in recent years. However, the video analytics pipeline is energy-intensive due to high data rates and reliance on complex inference algorithms, which limits its adoption in energy-constrained applications. Motivated by the observation of high and variable spatial redundancy and temporal dynamics in video data streams, we design and evaluate an adaptive-resolution optimization framework to minimize the energy use of multi-task video analytics pipelines. Instead of heuristically tuning the input data resolution of individual tasks, our framework uses deep reinforcement learning to dynamically govern the input resolution and computation of the entire video analytics pipeline. By monitoring the impact of varying resolution on the quality of high-dimensional video analytics features, and hence on the accuracy of video analytics results, the proposed end-to-end optimization framework learns the best non-myopic policy for dynamically controlling the resolution of input video streams to achieve globally optimized energy efficiency. Governed by reinforcement learning, optical flow is incorporated into the framework to minimize unnecessary spatio-temporal redundancy that leads to re-computation, while preserving accuracy. The proposed framework is applied to video instance segmentation, which is one of the most challenging machine vision tasks, and its energy efficiency significantly surpasses that of all baseline methods of similar accuracy on the YouTube-VIS dataset.
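To make the idea of learning a resolution-control policy concrete, the following is a minimal sketch and not the authors' method. It swaps deep reinforcement learning for plain tabular Q-learning on a toy environment. Everything in it is a hypothetical stand-in chosen for illustration: the two scene states, the three resolution scales, the accuracy and energy models, the trade-off weight, and all function names. The abstract specifies none of them.

```python
import random

# All constants below are hypothetical illustration values, not from the paper.
SCALES = [0.25, 0.5, 1.0]   # candidate input-resolution scale factors (actions)
N_STATES = 2                # toy scene states: 0 = simple scene, 1 = complex scene
ENERGY_WEIGHT = 0.8         # weight of the energy penalty in the reward


def toy_reward(state, action):
    """Toy reward: accuracy minus weighted energy (assumed model)."""
    scale = SCALES[action]
    energy = scale ** 2                        # pixel count grows with scale squared
    sensitivity = 0.2 if state == 0 else 1.6   # complex scenes suffer more from downscaling
    accuracy = max(0.0, 1.0 - sensitivity * (1.0 - scale))
    return accuracy - ENERGY_WEIGHT * energy


def toy_step(state, rng):
    """Scene complexity persists with probability 0.9, otherwise flips."""
    return state if rng.random() < 0.9 else 1 - state


def train(steps=100_000, alpha=0.01, gamma=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0] * len(SCALES) for _ in range(N_STATES)]
    state = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(SCALES))
        else:
            action = max(range(len(SCALES)), key=lambda a: q[state][a])
        reward = toy_reward(state, action)
        next_state = toy_step(state, rng)
        # Discounted target: immediate reward plus estimated future value.
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q


def greedy_policy(q):
    """Return the preferred resolution scale for each toy scene state."""
    return [SCALES[max(range(len(SCALES)), key=lambda a: q[s][a])]
            for s in range(N_STATES)]
```

Calling `greedy_policy(train())` gives the learned scale for each toy state. Under these assumed models, the agent should prefer the lowest resolution for simple scenes and full resolution for complex ones. In this toy the chosen resolution does not influence the next scene state, so the discounted term only illustrates the form of a non-myopic update; it does not change which action is best.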




