This paper introduces the HumTrans dataset, which is publicly available ...
Background music (BGM) can enhance the video's emotion. However, selecti...
Tags are pivotal in facilitating the effective distribution of multimedi...
Both masked image modeling (MIM) and natural language supervision have
f...
Since the development of self-supervised visual representation learning ...
This paper focuses on self-supervised video representation learning. Mos...
In software engineering (SE) tasks, the naming approach is so important ...
A large-scale evaluation for current naming approaches substantiates tha...
Pre-training a model to learn transferable video-text representation for...
Video captioning is a challenging task since it requires generating sent...
This report describes our solution to the VALUE Challenge 2021 in the
ca...
The crux of self-supervised video representation learning is to build ge...