Human-Scene Interaction (HSI) is a vital component of fields like embodi...
Visual segmentation seeks to partition images, video frames, or point cl...
Recent multi-camera 3D object detectors usually leverage temporal inform...
This paper introduces the Masked Voxel Jigsaw and Reconstruction (MV-JAR...
DEtection TRansformer (DETR) started a trend that uses a group of learna...
The goal of video segmentation is to accurately segment and track every ...
One-to-one label assignment in object detection has successfully obviate...
Perceiving 3D objects from monocular inputs is crucial for robotic syste...
End-to-end object detection has progressed rapidly after the emergence of...
This paper presents Video K-Net, a simple, strong, and unified framework...
Multi-Object Tracking (MOT) has rapidly progressed with the development ...
This paper presents Dense Siamese Network (DenseSiam), a simple unsuperv...
Instance recognition has advanced rapidly along with the developments of ...
Unsupervised domain adaptation (UDA) aims to adapt a model of the labele...
Domain adaptation aims to bridge the domain shifts between the source an...
3D object detection is an important capability needed in various practic...
Semantic, instance, and panoptic segmentations have been addressed using...
Monocular 3D object detection is an important task for autonomous drivin...
This report presents the approach used in the submission of the LVIS Cha...
Similarity metrics for instances have drawn much attention, due to their...
Current object detection frameworks mainly rely on bounding box regressi...
We present MMDetection, an object detection toolbox that contains a rich...
Compared with model architectures, the training process, which is also c...
Recently, convolutional neural networks have brought impressive improvemen...
Cascade is a classic yet powerful architecture that has boosted performa...
The basic principles in designing convolutional neural network (CNN) str...