Employing Hybrid Deep Neural Networks on Dari Speech

05/04/2023
by Jawid Ahmad Baktash, et al.

This paper is an extension of our previous conference paper. In recent years, researchers have shown growing interest in developing and improving speech recognition systems to facilitate and enhance human-computer interaction. Automatic Speech Recognition (ASR) systems are now ubiquitous, appearing in games, translation systems, robots, and more. However, much research is still needed on speech recognition for low-resource languages. This article focuses on recognizing individual words in the Dari language using Mel-frequency cepstral coefficients (MFCCs) for feature extraction and three deep neural network models: a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN), and a Multilayer Perceptron (MLP), as well as two hybrid models that combine CNN and RNN. We evaluate these models on an isolated Dari word corpus that we created, consisting of 1000 utterances of 20 short Dari terms. Our study achieved an average accuracy of 98.365
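To make the MFCC feature extraction step concrete, the following is a minimal NumPy sketch of a textbook MFCC pipeline (pre-emphasis, framing, Hamming window, power spectrum, mel filterbank, log, DCT-II). It is not the authors' implementation. The function names and every parameter value (16 kHz sampling, 25 ms frames with a 10 ms step, 512-point FFT, 26 filters, 13 coefficients, 0.97 pre-emphasis) are assumptions based on common defaults, since the abstract does not state the settings used. A synthetic tone stands in for a Dari utterance.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):        # rising edge of the triangle
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):       # falling edge of the triangle
            fbank[m - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=0.025, frame_step=0.010,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return an array of shape (n_frames, n_coeffs). All defaults are assumed."""
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis boosts high frequencies (0.97 is a conventional choice).
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])

    flen = int(round(frame_len * sample_rate))
    fstep = int(round(frame_step * sample_rate))
    if len(emphasized) < flen:               # pad very short inputs to one frame
        emphasized = np.pad(emphasized, (0, flen - len(emphasized)))
    n_frames = 1 + (len(emphasized) - flen) // fstep

    # Slice overlapping frames and apply a Hamming window to each.
    idx = np.arange(flen)[None, :] + fstep * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(flen)

    # Power spectrum, then mel filterbank energies on a log scale.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(np.maximum(energies, 1e-10))

    # DCT-II (unnormalized) decorrelates the log energies; keep the first n_coeffs.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1)
                   / (2 * n_filters))
    return log_energies @ basis.T

# Synthetic 1-second, 440 Hz tone in place of a recorded utterance.
t = np.arange(16000) / 16000.0
features = mfcc(np.sin(2 * np.pi * 440.0 * t))
print(features.shape)  # (98, 13)
```

Each utterance becomes a frames-by-coefficients matrix. A matrix like this can be treated as a 2-D input for a CNN, as a sequence of frame vectors for an RNN, or flattened for an MLP. How the paper actually feeds features to each model is not specified in the abstract.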


research
10/05/2021

Is Attention always needed? A Case Study on Language Identification from Speech

Language Identification (LID), a recommended initial step to Automatic S...
research
02/14/2021

Thank you for Attention: A survey on Attention-based Artificial Neural Networks for Automatic Speech Recognition

Attention is a very popular and effective mechanism in artificial neural...
research
10/26/2020

Decentralizing Feature Extraction with Quantum Convolutional Neural Network for Automatic Speech Recognition

We propose a novel decentralized feature extraction approach in federate...
research
08/29/2021

Attempt to Predict Failure Case Classification in a Failure Database by using Neural Network Models

With the recent progress of information technology, the use of networked...
research
01/27/2017

A Comprehensive Survey on Bengali Phoneme Recognition

Hidden Markov model based various phoneme recognition methods for Bengal...
research
02/18/2018

Improved TDNNs using Deep Kernels and Frequency Dependent Grid-RNNs

Time delay neural networks (TDNNs) are an effective acoustic model for l...
research
02/03/2021

Effects of Number of Filters of Convolutional Layers on Speech Recognition Model Accuracy

Inspired by the progress of the End-to-End approach [1], this paper syst...
