James Qin

research

∙ 06/22/2023

AudioPaLM: A Large Language Model That Can Speak and Listen

We introduce AudioPaLM, a large language model for speech understanding ...

Paul K. Rubenstein, et al.

research

∙ 06/13/2023

Efficient Adapters for Giant Speech Models

Large pre-trained speech models are widely used as the de-facto paradigm...

Nanxin Chen, et al.

research

∙ 03/02/2023

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

We introduce the Universal Speech Model (USM), a single large model that...

Yu Zhang, et al.

research

∙ 02/03/2022

Self-supervised Learning with Random-projection Quantizer for Speech Recognition

We present a simple and effective self-supervised learning approach for ...

Chung-Cheng Chiu, et al.

research

∙ 10/09/2021

Vector-quantized Image Modeling with Improved VQGAN

Pretraining language models with next-token prediction on massive text c...

Jiahui Yu, et al.

research

∙ 09/27/2021

BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition

We summarize the results of a host of efforts using giant automatic spee...

Yu Zhang, et al.

research

∙ 08/07/2021

W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training

Motivated by the success of masked language modeling (MLM) in pre-traini...

Yu-An Chung, et al.

research

∙ 04/30/2021

Scaling End-to-End Models for Large-Scale Multilingual ASR

Building ASR models across many language families is a challenging multi...

Bo Li, et al.

research

∙ 11/21/2020

A Better and Faster End-to-End Model for Streaming ASR

End-to-end (E2E) models have been shown to outperform state-of-the-art conven...

Bo Li, et al.

research

∙ 10/20/2020

Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition

We employ a combination of recent developments in semi-supervised learni...

Yu Zhang, et al.

research

∙ 08/30/2020

Parallel Rescoring with Transformer for Streaming On-Device Speech Recognition

Recent advances of end-to-end models have outperformed conventional mode...

Wei Li, et al.

research

∙ 05/16/2020

Conformer: Convolution-augmented Transformer for Speech Recognition

Recently Transformer and Convolution neural network (CNN) based models h...

Anmol Gulati, et al.

research

∙ 05/07/2020

ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context

Convolutional neural networks (CNN) have shown promising results for end...

Wei Han, et al.

research

∙ 02/21/2019

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

Lingvo is a TensorFlow framework offering a complete solution for collab...

Jonathan Shen, et al.
