Xi Yin

research

∙ 06/23/2023

Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems

Current ASR systems are mainly trained and evaluated at the utterance le...

0 Mingyu Cui, et al. ∙

research

∙ 03/29/2023

MaLP: Manipulation Localization Using a Proactive Scheme

Advancements in the generation quality of various Generative Models (GMs...

0 Vishal Asnani, et al. ∙

research

∙ 11/25/2022

SpaText: Spatio-Textual Representation for Controllable Image Generation

Recent text-to-image diffusion models are able to generate convincing re...

0 Omri Avrahami, et al. ∙

research

∙ 04/07/2022

Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer

Videos are created to express emotion, exchange information, and share e...

3 Songwei Ge, et al. ∙

research

∙ 03/29/2022

Proactive Image Manipulation Detection

Image manipulation detection algorithms are often trained to discriminat...

1 Vishal Asnani, et al. ∙

research

∙ 06/15/2021

Reverse Engineering of Generative Models: Inferring Model Hyperparameters from Generated Images

State-of-the-art (SOTA) Generative Models (GMs) can synthesize photo-rea...

7 Vishal Asnani, et al. ∙

research

∙ 03/29/2021

A Multiplexed Network for End-to-End, Multilingual OCR

Recent advances in OCR have shown that an end-to-end (E2E) training pipe...

0 Jing Huang, et al. ∙

research

∙ 03/16/2021

KGSynNet: A Novel Entity Synonyms Discovery Framework with Knowledge Graph

Entity synonyms discovery is crucial for entity-leveraging applications....

0 Yiying Yang, et al. ∙

research

∙ 12/14/2020

img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation

We propose real-time, six degrees of freedom (6DoF), 3D face pose estima...

5 Vitor Albiero, et al. ∙

research

∙ 12/08/2020

TAP: Text-Aware Pre-training for Text-VQA and Text-Caption

In this paper, we propose Text-Aware Pre-training (TAP) for Text-VQA and...

0 Zhengyuan Yang, et al. ∙

research

∙ 09/28/2020

VIVO: Surpassing Human Performance in Novel Object Captioning with Visual Vocabulary Pre-Training

It is highly desirable yet challenging to generate image captions that c...

1 Xiaowei Hu, et al. ∙

research

∙ 05/22/2020

Hashing-based Non-Maximum Suppression for Crowded Object Detection

In this paper, we propose an algorithm, named hashing-based non-maximum ...

1 Jianfeng Wang, et al. ∙

research

∙ 04/13/2020

Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks

Large-scale pre-training methods of learning cross-modal representations...

7 Xiujun Li, et al. ∙

research

∙ 11/26/2019

FAN: Feature Adaptation Network for Surveillance Face Recognition and Normalization

This paper studies face recognition (FR) and normalization in surveillan...

8 Xi Yin, et al. ∙

research

∙ 04/09/2019

Gait Recognition via Disentangled Representation Learning

Gait, the walking pattern of individuals, is one of the most important b...

12 Ziyuan Zhang, et al. ∙

research

∙ 03/23/2018

Feature Transfer Learning for Deep Face Recognition with Long-Tail Data

Real-world face recognition datasets exhibit long-tail characteristics, ...

0 Xi Yin, et al. ∙

research

∙ 06/26/2017

Illuminating Pedestrians via Simultaneous Detection & Segmentation

Pedestrian detection is a critical problem in computer vision with signi...

0 Garrick Brazil, et al. ∙

research

∙ 05/31/2017

Representation Learning by Rotating Your Faces

The large pose discrepancy between two face images is one of the fundame...

0 Luan Tran, et al. ∙

research

∙ 05/02/2015

Joint Multi-Leaf Segmentation, Alignment and Tracking from Fluorescence Plant Videos

This paper proposes a novel framework for fluorescence plant video proce...

0 Xi Yin, et al. ∙
