Hynek Hermansky

research

∙ 03/22/2023

Self-supervised Learning with Speech Modulation Dropout

We show that training a multi-headed self-attention-based deep network t...

Samik Sadhu, et al.

research

∙ 03/07/2023

Stabilized training of joint energy-based models and their practical applications

The recently proposed Joint Energy-based Model (JEM) interprets discrimi...

Martin Sustek, et al.

research

∙ 09/30/2022

Blind Signal Dereverberation for Machine Speech Recognition

We present a method to remove unknown convolutive noise introduced to sp...

Samik Sadhu, et al.

research

∙ 03/31/2022

Importance of Different Temporal Modulations of Speech: A Tale of Two Perspectives

How important are different temporal speech modulations for speech recog...

Samik Sadhu, et al.

research

∙ 03/24/2022

Complex Frequency Domain Linear Prediction: A Tool to Compute Modulation Spectrum of Speech

Conventional Frequency Domain Linear Prediction (FDLP) technique models ...

Samik Sadhu, et al.

research

∙ 03/25/2021

Radically Old Way of Computing Spectra: Applications in End-to-End ASR

We propose a technique to compute spectrograms using Frequency Domain Li...

Samik Sadhu, et al.

research

∙ 02/05/2021

Two-Stage Augmentation and Adaptive CTC Fusion for Improved Robustness of Multi-Stream End-to-End ASR

Performance degradation of an Automatic Speech Recognition (ASR) system ...

Ruizhi Li, et al.

research

∙ 10/23/2019

A practical two-stage training strategy for multi-stream end-to-end speech recognition

The multi-stream paradigm of audio processing, in which several sources ...

Ruizhi Li, et al.

research

∙ 06/17/2019

Multi-Stream End-to-End Speech Recognition

Attention-based methods and Connectionist Temporal Classification (CTC) ...

Ruizhi Li, et al.

research

∙ 04/09/2019

Performance Monitoring for End-to-End Speech Recognition

Measuring performance of an automatic speech recognition (ASR) system wi...

Ruizhi Li, et al.

research

∙ 04/08/2019

Exploring Methods for the Automatic Detection of Errors in Manual Transcription

Quality of data plays an important role in most deep learning tasks. In ...

Xiaofei Wang, et al.

research

∙ 11/12/2018

Stream attention-based multi-array end-to-end speech recognition

Automatic Speech Recognition (ASR) using multiple microphone arrays has ...

Xiaofei Wang, et al.

research

∙ 11/12/2018

Multi-encoder multi-resolution framework for end-to-end speech recognition

Attention-based methods and Connectionist Temporal Classification (CTC) ...

Ruizhi Li, et al.

research

∙ 11/29/2017

Stream Attention for far-field multi-microphone ASR

A stream attention framework has been applied to the posterior probabili...

Xiaofei Wang, et al.
