Three-Module Modeling For End-to-End Spoken Language Understanding Using Pre-trained DNN-HMM-Based Acoustic-Phonetic Model

04/07/2022
by Nick J. C. Wang, et al.

In spoken language understanding (SLU), what the user says is converted to his/her intent. Recent work on end-to-end SLU has shown that accuracy can be improved via pre-training approaches. We revisit ideas presented by Lugosch et al. using speech pre-training and three-module modeling; however, to ease construction of the end-to-end SLU model, we use as our phoneme module an open-source acoustic-phonetic model from a DNN-HMM hybrid automatic speech recognition (ASR) system instead of training one from scratch. Hence we fine-tune on speech only for the word module, and we apply multi-target learning (MTL) on the word and intent modules to jointly optimize SLU performance. MTL yields a relative reduction of 40% in error rates (from 1.0% [...]) [...] streaming method. The final outcome of the proposed three-module modeling approach yields an intent accuracy of 99.4% [...] rate reduction of 50% [...] real-time streaming methods, we also list non-streaming methods for comparison.
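
The relative error-rate figures in the abstract can be checked with a little arithmetic. The sketch below assumes that "relative reduction" means (baseline − new) / baseline and that the abstract's 40 and 1.0 are percentages; the helper names are illustrative and not from the paper. The resulting 0.6% error rate is computed here, not quoted from the source, although it is consistent with the reported 99.4% intent accuracy.

```python
def relative_reduction(baseline: float, new: float) -> float:
    """Fraction by which `new` is lower than `baseline` (assumed definition)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - new) / baseline


def apply_reduction(baseline: float, reduction: float) -> float:
    """Error rate left after removing a relative `reduction` from `baseline`."""
    return baseline * (1.0 - reduction)


# Figures read from the abstract, interpreted as percentages (an assumption).
baseline_error = 1.0   # intent error rate before MTL, in percent
mtl_reduction = 0.40   # 40% relative reduction

error_after_mtl = apply_reduction(baseline_error, mtl_reduction)
accuracy_after_mtl = 100.0 - error_after_mtl

print(f"error after MTL:    {error_after_mtl:.1f}%")
print(f"accuracy after MTL: {accuracy_after_mtl:.1f}%")
```

Because error rates near 1% are small, a large relative reduction corresponds to only a fraction of a percentage point of absolute accuracy, which is why the abstract reports relative figures.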


research
04/07/2021

Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs

A major focus of recent research in spoken language understanding (SLU) ...
research
04/07/2019

Speech Model Pre-training for End-to-End Spoken Language Understanding

Whereas conventional spoken language understanding (SLU) systems map spe...
research
08/05/2020

Improving End-to-End Speech-to-Intent Classification with Reptile

End-to-end spoken language understanding (SLU) systems have many advanta...
research
03/30/2021

Pre-training for low resource speech-to-intent applications

Designing a speech-to-intent (S2I) agent which maps the users' spoken co...
research
09/24/2018

From Audio to Semantics: Approaches to end-to-end spoken language understanding

Conventional spoken language understanding systems consist of two main c...
research
04/08/2022

A Study of Different Ways to Use The Conformer Model For Spoken Language Understanding

SLU combines ASR and NLU capabilities to accomplish speech-to-intent und...
research
05/20/2021

A Streaming End-to-End Framework For Spoken Language Understanding

End-to-end spoken language understanding (SLU) has recently attracted in...
