Audio-driven portrait animation aims to synthesize portrait videos that ...
Vision-language alignment learning for video-text retrieval arouses a lo...
Despite that convolution neural networks (CNN) have recently demonstrate...
In recent years, Convolutional Neural Network (CNN) based trackers have
...
Video object detection is a tough task due to the deteriorated quality o...