Xize Cheng | DeepAI

Featured Co-authors

Zhou Zhao
125 publications
fcq
109 publications
Yi Ren
101 publications
Ye Wang
89 publications
Xiang Yin
36 publications
Jinglin Liu
34 publications
Li Tang
25 publications
Rongjie Huang
25 publications
Gang Sun
17 publications
Zhenhui Ye
17 publications
Yichen Zhu
16 publications

research

∙ 07/25/2023

3DRP-Net: 3D Relative Position-aware Network for 3D Visual Grounding

3D visual grounding aims to localize the target object in a 3D point clo...

Zehan Wang, et al.

research

∙ 07/18/2023

Distilling Coarse-to-Fine Semantic Matching Knowledge for Weakly Supervised 3D Visual Grounding

3D visual grounding involves finding a target object in a 3D scene that ...

Zehan Wang, et al.

research

∙ 06/10/2023

OpenSR: Open-Modality Speech Recognition via Maintaining Multi-Modality Alignment

Speech Recognition builds a bridge between the multimedia streaming (aud...

Xize Cheng, et al.

research

∙ 05/24/2023

AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation

Direct speech-to-speech translation (S2ST) aims to convert speech from o...

Rongjie Huang, et al.

research

∙ 05/22/2023

Connecting Multi-modal Contrastive Representations

Multi-modal Contrastive Representation (MCR) learning aims to encode dif...

Zehan Wang, et al.

research

∙ 05/21/2023

Wav2SQL: Direct Generalizable Speech-To-SQL Parsing

Speech-to-SQL (S2SQL) aims to convert spoken questions into SQL queries ...

Huadai Liu, et al.

research

∙ 03/09/2023

MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition

Multi-media communications facilitate global interaction among people. H...

Xize Cheng, et al.

research

∙ 11/21/2022

Diffusion Denoising Process for Perceptron Bias in Out-of-distribution Detection

Out-of-distribution (OOD) detection is an important task to ensure the r...

Luping Liu, et al.
