Speech Emotion Recognition using Semantic Information

03/04/2021
by Panagiotis Tzirakis, et al.

Speech emotion recognition is a crucial problem that arises in a multitude of applications, such as human-computer interaction and education. Although several advancements have been made in recent years, especially with the advent of Deep Neural Networks (DNNs), most studies in the literature fail to consider the semantic information in the speech signal. In this paper, we propose a novel framework that captures both the semantic and the paralinguistic information in the signal. In particular, our framework comprises a semantic feature extractor, which captures the semantic information, and a paralinguistic feature extractor, which captures the paralinguistic information. The semantic and paralinguistic features are then combined into a unified representation using a novel attention mechanism. The unified feature vector is passed through an LSTM to capture the temporal dynamics of the signal before the final prediction. To validate the effectiveness of our framework, we use the popular SEWA dataset of the AVEC challenge series and compare against the three winning papers. Our model provides state-of-the-art results in the valence and liking dimensions.
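The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the attention mechanism is defined, so everything below is an assumption made for illustration: the function names (`fuse_frame`, `fuse_sequence`), the use of a single query vector, and the dot-product scoring are hypothetical and are not the authors' method. The sketch weights the two feature vectors of each frame with softmax attention scores and sums them into one fused vector; the resulting sequence is what a recurrent layer such as an LSTM would then consume (not shown).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_frame(semantic, paralinguistic, query):
    """Fuse two same-sized feature vectors for one frame.

    ASSUMPTION: dot-product scoring against a single query vector.
    The paper's actual attention mechanism is not given in the abstract.
    """
    scores = [dot(query, semantic), dot(query, paralinguistic)]
    w_sem, w_par = softmax(scores)
    fused = [w_sem * s + w_par * p for s, p in zip(semantic, paralinguistic)]
    return fused, (w_sem, w_par)


def fuse_sequence(semantic_seq, paralinguistic_seq, query):
    """Apply fuse_frame to every time step; returns the fused sequence."""
    return [
        fuse_frame(s, p, query)[0]
        for s, p in zip(semantic_seq, paralinguistic_seq)
    ]


# Toy example with 2-dimensional features and an all-zero query,
# which scores both streams equally and therefore averages them.
fused, weights = fuse_frame([1.0, 0.0], [0.0, 1.0], [0.0, 0.0])
```

In a trained model the query (or whatever parameters the scoring function uses) would be learned, so the weights would shift toward whichever stream is more informative for a given frame.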


research
11/04/2021

Facial Emotion Recognition using Deep Residual Networks in Real-World Environments

Automatic affect recognition using visual cues is an important task towa...
research
02/07/2017

MORSE: Semantic-ally Drive-n MORpheme SEgment-er

We present in this paper a novel framework for morpheme segmentation whi...
research
05/17/2018

Convolutional Attention Networks for Multimodal Emotion Recognition from Speech and Text Data

Emotion recognition has become a popular topic of interest, especially i...
research
04/28/2022

Emotion Recognition In Persian Speech Using Deep Neural Networks

Speech Emotion Recognition (SER) is of great importance in Human-Compute...
research
05/30/2023

Leveraging Semantic Information for Efficient Self-Supervised Emotion Recognition with Audio-Textual Distilled Models

In large part due to their implicit semantic modeling, self-supervised l...
research
10/31/2020

Efficient Arabic emotion recognition using deep neural networks

Emotion recognition from speech signal based on deep learning is an acti...
research
07/12/2017

A breakthrough in Speech emotion recognition using Deep Retinal Convolution Neural Networks

Speech emotion recognition (SER) is to study the formation and change of...
