Ozlem Kalinli

research

∙ 09/19/2023

End-to-End Speech Recognition Contextualization with Large Language Models

In recent years, Large Language Models (LLMs) have garnered significant ...

0 Egor Lakomkin, et al. ∙

research

∙ 09/17/2023

Augmenting text for spoken language understanding with Large Language Models

Spoken semantic parsing (SSP) involves generating machine-comprehensible...

0 Roshan Sharma, et al. ∙

research

∙ 09/12/2023

Recovering from Privacy-Preserving Masking with Large Language Models

Model adaptation is crucial to handle the discrepancy between proxy trai...

0 Arpita Vats, et al. ∙

research

∙ 09/05/2023

TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models

Automatic Speech Recognition (ASR) models need to be optimized for speci...

0 Yuan Shangguan, et al. ∙

research

∙ 09/01/2023

Contextual Biasing of Named-Entities with Large Language Models

This paper studies contextual biasing with Large Language Models (LLMs),...

0 Chuanneng Sun, et al. ∙

research

∙ 07/22/2023

Modality Confidence Aware Training for Robust End-to-End Spoken Language Understanding

End-to-end (E2E) spoken language understanding (SLU) systems that genera...

0 Suyoun Kim, et al. ∙

research

∙ 07/21/2023

Prompting Large Language Models with Speech Recognition Abilities

Large language models have proven themselves highly flexible, able to so...

0 Yassir Fathullah, et al. ∙

research

∙ 05/30/2023

Towards Selection of Text-to-speech Data to Augment ASR Training

This paper presents a method for selecting appropriate synthetic speech ...

0 Shuo Liu, et al. ∙

research

∙ 05/21/2023

Multi-Head State Space Model for Speech Recognition

State space models (SSMs) have recently shown promising results on small...

0 Yassir Fathullah, et al. ∙

research

∙ 11/10/2022

Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities

End-to-end multilingual ASR has become more appealing because of several...

0 Andros Tjandra, et al. ∙

research

∙ 10/31/2022

Joint Audio/Text Training for Transformer Rescorer of Streaming Speech Recognition

Recently, there has been an increasing interest in two-pass streaming en...

0 Suyoun Kim, et al. ∙

research

∙ 10/20/2022

Anchored Speech Recognition with Neural Transducers

Neural transducers have gained popularity in production ASR systems, ach...

0 Desh Raj, et al. ∙

research

∙ 09/13/2022

Learning ASR pathways: A sparse multilingual ASR model

Neural network pruning can be effectively applied to compress automatic ...

8 Mu Yang, et al. ∙

research

∙ 07/25/2022

Learning a Dual-Mode Speech Recognition Model via Self-Pruning

There is growing interest in unifying the streaming and full-context aut...

0 Chunxi Liu, et al. ∙

research

∙ 04/04/2022

Deliberation Model for On-Device Spoken Language Understanding

We propose a novel deliberation-based approach to end-to-end (E2E) spoke...

1 Duc Le, et al. ∙

research

∙ 03/30/2022

Federated Domain Adaptation for ASR with Full Self-Supervision

Cross-device federated learning (FL) protects user privacy by collaborat...

0 Junteng Jia, et al. ∙

research

∙ 03/29/2022

Streaming parallel transducer beam search with fast-slow cascaded encoders

Streaming ASR with strict latency constraints is required in many speech...

0 Jay Mahadeokar, et al. ∙

research

∙ 01/28/2022

Neural-FST Class Language Model for End-to-End Speech Recognition

We propose Neural-FST Class Language Model (NFCLM) for end-to-end speech...

0 Antoine Bruguier, et al. ∙

research

∙ 11/10/2021

Scaling ASR Improves Zero and Few Shot Learning

With 4.5 million hours of English speech from 10 different sources acros...

0 Alex Xiao, et al. ∙

research

∙ 10/15/2021

Omni-sparsity DNN: Fast Sparsity Optimization for On-Device Streaming E2E ASR via Supernet

From wearables to powerful smart devices, modern automatic speech recogn...

0 Haichuan Yang, et al. ∙

research

∙ 10/11/2021

Evaluating User Perception of Speech Recognition System Quality with Semantic Distance Metric

Measuring automatic speech recognition (ASR) system quality is critical ...

0 Suyoun Kim, et al. ∙

research

∙ 10/07/2021

Streaming Transformer Transducer Based Speech Recognition Using Non-Causal Convolution

This paper improves the streaming transformer transducer for speech reco...

0 Yangyang Shi, et al. ∙

research

∙ 10/07/2021

Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study

Detection of common events and scenes from audio is useful for extractin...

0 Dawei Liang, et al. ∙

research

∙ 07/09/2021

Noisy Training Improves E2E ASR for the Edge

Automatic speech recognition (ASR) has become increasingly ubiquitous on...

0 Dilin Wang, et al. ∙

research

∙ 06/16/2021

Collaborative Training of Acoustic Encoders for Speech Recognition

On-device speech recognition requires training models of different sizes...

0 Varun Nagaraja, et al. ∙

research

∙ 04/06/2021

Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios

Often, the storage and computational constraints of embedded devices dema...

0 Jay Mahadeokar, et al. ∙

research

∙ 04/06/2021

Dissecting User-Perceived Latency of On-Device E2E Speech Recognition

As speech-enabled devices such as smartphones and smart speakers become ...

0 Yuan Shangguan, et al. ∙

research

∙ 04/05/2021

Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion

How to leverage dynamic contextual information in end-to-end speech reco...

0 Duc Le, et al. ∙

research

∙ 04/05/2021

Dynamic Encoder Transducer: A Flexible Solution For Trading Off Accuracy For Latency

We propose a dynamic encoder transducer (DET) for on-device speech recog...

0 Yangyang Shi, et al. ∙

research

∙ 04/05/2021

Semantic Distance: A New Metric for ASR Performance Analysis Towards Spoken Language Understanding

Word Error Rate (WER) has been the predominant metric used to evaluate t...

0 Suyoun Kim, et al. ∙

research

∙ 09/05/2019

Bandwidth Embeddings for Mixed-bandwidth Speech Recognition

In this paper, we tackle the problem of handling narrowband and wideband...

0 Gautam Mantena, et al. ∙
