Errui Ding

research

∙ 09/01/2023

VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation

In this paper, we present VideoGen, a text-to-video generation approach,...

0 Xin Li, et al. ∙

research

∙ 07/30/2023

HD-Fusion: Detailed Text-to-3D Generation Leveraging Multiple Noise Estimation

In this paper, we study Text-to-3D content generation leveraging 2D diff...

0 Jinbo Wu, et al. ∙

research

∙ 07/16/2023

Semi-DETR: Semi-Supervised Object Detection with Detection Transformers

We analyze the DETR-based framework on semi-supervised object detection ...

1 Jiacheng Zhang, et al. ∙

research

∙ 06/29/2023

Learning Structure-Guided Diffusion Model for 2D Human Pose Estimation

One of the mainstream schemes for 2D human pose estimation (HPE) is lear...

0 Zhongwei Qiu, et al. ∙

research

∙ 06/05/2023

ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images

Structured text extraction is one of the most valuable and challenging a...

0 Wenwen Yu, et al. ∙

research

∙ 05/22/2023

Building an Invisible Shield for Your Portrait against Deepfakes

The issue of detecting deepfakes has garnered significant attention in t...

0 Jiazhi Guan, et al. ∙

research

∙ 05/12/2023

Multi-Modal 3D Object Detection by Box Matching

Multi-modal 3D object detection has received growing attention as the in...

0 Zhe Liu, et al. ∙

research

∙ 05/09/2023

StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator

Despite recent advances in syncing lip movements with any audio waves, c...

0 Jiazhi Guan, et al. ∙

research

∙ 03/27/2023

ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every Detection Box

Multi-object tracking (MOT) aims at estimating bounding boxes and identi...

0 Yifu Zhang, et al. ∙

research

∙ 03/27/2023

Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection

With basic Semi-Supervised Object Detection (SSOD) techniques, one-stage...

0 Chang Liu, et al. ∙

research

∙ 03/17/2023

CAPE: Camera View Position Embedding for Multi-View 3D Object Detection

In this paper, we address the problem of detecting 3D objects from multi...

0 Kaixin Xiong, et al. ∙

research

∙ 03/16/2023

PSVT: End-to-End Multi-person 3D Pose and Shape Estimation with Progressive Video Transformers

Existing methods of multi-person video 3D human Pose and Shape Estimatio...

0 Zhongwei Qiu, et al. ∙

research

∙ 03/09/2023

LMR: A Large-Scale Multi-Reference Dataset for Reference-based Super-Resolution

It is widely agreed that reference-based super-resolution (RefSR) achiev...

0 Lin Zhang, et al. ∙

research

∙ 03/03/2023

Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement

Neural Radiance Fields (NeRF) have constituted a remarkable breakthrough...

0 Jiaxiang Tang, et al. ∙

research

∙ 03/01/2023

StrucTexTv2: Masked Visual-Textual Prediction for Document Image Pre-training

In this paper, we present StrucTexTv2, an effective document image pre-t...

0 Yuechen Yu, et al. ∙

research

∙ 02/14/2023

Make Your Brief Stroke Real and Stereoscopic: 3D-Aware Simplified Sketch to Portrait Generation

Creating the photo-realistic version of people sketched portraits is use...

0 Yasheng Sun, et al. ∙

research

∙ 01/26/2023

Graph Contrastive Learning for Skeleton-based Action Recognition

In the field of skeleton-based action recognition, current top-performin...

0 Xiaohu Huang, et al. ∙

research

∙ 01/04/2023

StereoDistill: Pick the Cream from LiDAR for Distilling Stereo-based 3D Object Detection

In this paper, we propose a cross-modal distillation method named Stereo...

0 Zhe Liu, et al. ∙

research

∙ 12/09/2022

Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers

Previous studies have explored generating accurately lip-synced talking ...

0 Yasheng Sun, et al. ∙

research

∙ 12/07/2022

Cyclically Disentangled Feature Translation for Face Anti-spoofing

Current domain adaptation methods for face anti-spoofing leverage labele...

0 Haixiao Yue, et al. ∙

research

∙ 11/17/2022

CAE v2: Context Autoencoder with CLIP Target

Masked image modeling (MIM) learns visual representation by masking and ...

0 Xinyu Zhang, et al. ∙

research

∙ 11/15/2022

Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling

DETR is a novel end-to-end transformer architecture object detector, whi...

0 Yu Wang, et al. ∙

research

∙ 11/07/2022

Group DETR v2: Strong Object Detector with Encoder-Decoder Pretraining

We present a strong object detector with encoder-decoder pretraining and...

0 Qiang Chen, et al. ∙

research

∙ 10/13/2022

U-HRNet: Delving into Improving Semantic Representation of High Resolution Network for Dense Prediction

High resolution and advanced semantic representation are both vital for ...

0 Jian Wang, et al. ∙

research

∙ 10/13/2022

RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer

Recently, transformer-based networks have shown impressive results in se...

0 Jian Wang, et al. ∙

research

∙ 10/11/2022

Repainting and Imitating Learning for Lane Detection

Current lane detection methods are struggling with the invisibility lane...

0 Yue He, et al. ∙

research

∙ 08/31/2022

MAFormer: A Transformer Network with Multi-scale Attention Fusion for Visual Recognition

Vision Transformer and its variants have demonstrated great potential in...

0 Yunhao Wang, et al. ∙

research

∙ 08/24/2022

AGO-Net: Association-Guided 3D Point Cloud Object Detection Network

The human brain can effortlessly recognize and localize objects, whereas...

0 Liang Du, et al. ∙

research

∙ 08/19/2022

Part-aware Prototypical Graph Network for One-shot Skeleton-based Action Recognition

In this paper, we study the problem of one-shot skeleton-based action re...

11 Tailin Chen, et al. ∙

research

∙ 08/08/2022

Boosting Video-Text Retrieval with Explicit High-Level Semantics

Video-text retrieval (VTR) is an attractive yet challenging task for mul...

0 Haoran Wang, et al. ∙

research

∙ 07/21/2022

Detecting Deepfake by Creating Spatio-Temporal Regularity Disruption

Despite encouraging progress in deepfake detection, generalization to un...

11 Jiazhi Guan, et al. ∙

research

∙ 07/21/2022

UFO: Unified Feature Optimization

This paper proposes a novel Unified Feature Optimization (UFO) paradigm ...

0 Teng Xi, et al. ∙

research

∙ 07/17/2022

Neural Color Operators for Sequential Image Retouching

We propose a novel image retouching method by modeling the retouching pr...

0 Yili Wang, et al. ∙

research

∙ 07/12/2022

Paint and Distill: Boosting 3D Object Detection with Semantic Passing Network

3D object detection task from lidar or camera sensors is essential for a...

0 Bo Ju, et al. ∙

research

∙ 07/06/2022

Delving into Sequential Patches for Deepfake Detection

Recent advances in face forgery techniques produce nearly visually untra...

6 Jiazhi Guan, et al. ∙

research

∙ 06/15/2022

Neural Deformable Voxel Grid for Fast Optimization of Dynamic View Synthesis

Recently, Neural Radiance Fields (NeRF) is revolutionizing the task of n...

0 Xiang Guo, et al. ∙

research

∙ 06/13/2022

Singular Value Fine-tuning: Few-shot Segmentation requires Few-parameters Fine-tuning

Freezing the pre-trained backbone has become a standard paradigm to avoi...

10 Yanpeng Sun, et al. ∙

research

∙ 06/01/2022

MaskOCR: Text Recognition with Masked Encoder-Decoder Pretraining

In this paper, we present a model pretraining technique, named MaskOCR, ...

0 Pengyuan Lyu, et al. ∙

research

∙ 04/20/2022

Human-Object Interaction Detection via Disentangled Transformer

Human-Object Interaction Detection tackles the problem of joint localiza...

0 Desen Zhou, et al. ∙

research

∙ 04/16/2022

GitNet: Geometric Prior-based Transformation for Birds-Eye-View Segmentation

Birds-eye-view (BEV) semantic segmentation is critical for autonomous dr...

0 Shi Gong, et al. ∙

research

∙ 04/14/2022

Implicit Sample Extension for Unsupervised Person Re-Identification

Most existing unsupervised person re-identification (Re-ID) methods use ...

0 Xinyu Zhang, et al. ∙

research

∙ 04/06/2022

MixFormer: Mixing Features across Windows and Dimensions

While local-window self-attention performs notably in vision tasks, it s...

9 Qiang Chen, et al. ∙

research

∙ 03/31/2022

ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval

Visual appearance is considered to be the most important cue to understa...

0 Mengjun Cheng, et al. ∙

research

∙ 03/25/2022

Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task

Concurrent perception datasets for autonomous driving are mainly limited...

0 Xiaoqing Ye, et al. ∙

research

∙ 03/05/2022

Adversarial Dual-Student with Differentiable Spatial Warping for Semi-Supervised Semantic Segmentation

A common challenge posed to robust semantic segmentation is the expensiv...

0 Cong Cao, et al. ∙

research

∙ 01/11/2022

MobileFaceSwap: A Lightweight Framework for Video Face Swapping

Advanced face swapping methods have achieved appealing results. However,...

10 Zhiliang Xu, et al. ∙

research

∙ 12/28/2021

The Devil is in the Task: Exploiting Reciprocal Appearance-Localization Features for Monocular 3D Object Detection

Low-cost monocular 3D object detection plays a fundamental role in auton...

12 Zhikang Zou, et al. ∙

research

∙ 12/03/2021

SGM3D: Stereo Guided Monocular 3D Object Detection

Monocular 3D object detection is a critical yet challenging task for aut...

3 Zheyuan Zhou, et al. ∙

research

∙ 11/26/2021

Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model

To achieve disentangled image manipulation, previous works depend heavil...

0 Zipeng Xu, et al. ∙

research

∙ 08/19/2021

An Information Theory-inspired Strategy for Automatic Network Pruning

Despite superior performance on many computer vision tasks, deep convolu...

41 Xiawu Zheng, et al. ∙
