GMM-Free Flat Start Sequence-Discriminative DNN Training

10/11/2016
by Gábor Gosztolya, et al.

Recently, attempts have been made to remove Gaussian mixture models (GMM) from the training process of deep neural network-based hidden Markov models (HMM/DNN). GMM-free training of an HMM/DNN hybrid requires solving two problems: obtaining the initial alignment of the frame-level state labels, and creating the context-dependent states. Although flat-start training by iteratively realigning and retraining the DNN with a frame-level error function is viable, it is quite cumbersome. Here, we propose to use a sequence-discriminative training criterion for flat start. While sequence-discriminative training is routinely applied only in the final phase of model training, we show that with proper care it is also suitable for obtaining an alignment with context-independent DNN models. For the construction of tied states we apply a recently proposed KL-divergence-based state clustering method; hence, our whole training process is GMM-free. In the experimental evaluation we found that the sequence-discriminative flat-start training method is not only significantly faster than the straightforward approach of iterative retraining and realignment, but also attains slightly better word error rates.
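To make the "initial alignment" problem concrete, the sketch below shows the conventional uniform segmentation used to start flat-start training: the frames of an utterance are divided as evenly as possible among the states of its transcription. The abstract does not say how the initial labels are produced, so this is only an illustrative assumption; the function name `uniform_flat_start_alignment` and the integer state ids are hypothetical.

```python
from typing import List, Sequence


def uniform_flat_start_alignment(num_frames: int, states: Sequence[int]) -> List[int]:
    """Label each frame with a state by splitting the utterance into equal parts.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not states:
        raise ValueError("need at least one state")
    if num_frames < len(states):
        raise ValueError("fewer frames than states")
    num_states = len(states)
    # Frame t is mapped to segment floor(t * num_states / num_frames), so the
    # segment lengths differ by at most one frame and state order is preserved.
    return [states[t * num_states // num_frames] for t in range(num_frames)]


# Ten frames shared among three (made-up) state ids.
alignment = uniform_flat_start_alignment(10, [3, 7, 5])
print(alignment)  # [3, 3, 3, 3, 7, 7, 7, 5, 5, 5]
```

In the iterative baseline described above, labels like these would serve only as the first training targets and would be replaced by forced alignments from each retrained DNN.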


