In recent years, live streaming platforms have gained immense popularity...
Video grounding aims to locate the timestamps best matching the query
Video understanding is an important task in short video business platfor...
Employing the large-scale pre-trained model CLIP to conduct video-text retri...
The task of multi-label image classification is to recognize all the obj...
Since the Transformer has found widespread use in NLP, the potential of
As an instance-level recognition problem, re-identification (re-ID) requ...
In recent years, supervised person re-identification (re-ID) models have...