Large language models have become a potential pathway toward achieving
a...
Recent self-supervised methods are mainly designed for representation
le...
The task of 3D semantic scene graph (3DSSG) prediction in the point clou...
Recently, perception task based on Bird's-Eye View (BEV) representation ...
Reconstructing a 3D shape based on a single sketch image is challenging ...
In computer vision, pre-training models based on largescale supervised
l...
Large-scale datasets play a vital role in computer vision. Existing data...
The rapid progress of photorealistic synthesis techniques has reached a
...
3D human mesh recovery from point clouds is essential for various tasks,...
3D object detection in point clouds is a challenging vision task that
be...
In this work, we propose a novel deep learning framework that can genera...
The rapid progress of photorealistic synthesis techniques has reached at...
Recently, deep learning has been utilized to solve video recognition pro...
Several variants of stochastic gradient descent (SGD) have been proposed...
As realistic facial manipulation technologies have achieved remarkable
p...
Given an existing system learned from previous source domains, it is
des...
One-shot NAS method has attracted much interest from the research commun...
3D point cloud completion, the task of inferring the complete geometric ...
Pedestrian attribute recognition has been an emerging research topic in ...
Text-image cross-modal retrieval is a challenging task in the field of
l...
In this paper, we propose a generative framework that unifies depth-base...
Synthesizing photo-realistic images from text descriptions is a challeng...
Dense captioning aims at simultaneously localizing semantic regions and
...
We present an efficient 3D object detection framework based on a single ...
This paper proposes the novel task of video generation conditioned on a
...
Imagining multiple consecutive frames given one single snapshot is
chall...
Multi-label image classification is a fundamental but challenging task
t...
Recognizing visual relationships <subject-predicate-object> among any pa...
Zero-shot artistic style transfer is an important image synthesis proble...
This paper proposes learning disentangled but complementary face feature...
Motion representation plays a vital role in human action recognition in
...
Pedestrian analysis plays a vital role in intelligent video surveillance...