Prodorshok I: A Bengali Isolated Speech Dataset for Voice-Based Assistive Technologies - A comparative analysis of the effects of data augmentation on HMM-GMM and DNN classifie

12/10/2017
by Mohi Reza, et al.

Prodorshok I is a Bengali isolated-word dataset designed to support speaker-independent, voice-command-driven automatic speech recognition (ASR) for assistive technologies that improve human-computer interaction (HCI). This paper presents an objective analysis, carried out on a subset of words from Prodorshok I, of the dataset's reliability in ASR systems built on Hidden Markov Models (HMM) with Gaussian emissions and on Deep Neural Networks (DNN). The results show that simple data augmentation involving a small pitch shift can tangibly improve recognition accuracy.
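
To make the idea of pitch-shift augmentation concrete, here is a minimal sketch in Python. The abstract does not say how the shift was implemented or how large it was, so everything here is an assumption for illustration: the function names `pitch_shift_resample` and `augment`, the half-semitone default, and the use of naive resampling, which changes the clip's duration along with its pitch (unlike phase-vocoder methods that preserve duration).

```python
import math

def pitch_shift_resample(samples, semitones):
    """Shift pitch by resampling with linear interpolation.

    Illustrative only: this naive method also scales the duration
    by the same factor as the pitch.
    """
    factor = 2.0 ** (semitones / 12.0)  # frequency ratio for the shift
    n_out = int(len(samples) / factor)
    out = []
    for i in range(n_out):
        pos = i * factor                    # fractional read position in the input
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)  # clamp at the final sample
        frac = pos - lo
        out.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return out

def augment(dataset, semitones=0.5):
    """Return the original (samples, label) pairs plus pitch-shifted copies."""
    shifted = [(pitch_shift_resample(x, semitones), y) for x, y in dataset]
    return list(dataset) + shifted

# Hypothetical usage: one second of a 440 Hz tone at 16 kHz with a placeholder label.
tone = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
augmented = augment([(tone, "word_0")])
```

Each utterance keeps its label, so the augmented set doubles the number of training examples per word without requiring new recordings.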
