Cascaded Cross-Modal Transformer for Request and Complaint Detection

07/27/2023
by   Nicolae-Catalin Ristea, et al.

We propose a novel cascaded cross-modal transformer (CCMT) that combines speech and text transcripts to detect customer requests and complaints in phone conversations. Our approach leverages a multimodal paradigm by transcribing the speech using automatic speech recognition (ASR) models and translating the transcripts into different languages. Subsequently, we combine language-specific BERT-based models with Wav2Vec2.0 audio features in a novel cascaded cross-attention transformer model. We apply our system to the Requests Sub-Challenge of the ACM Multimedia 2023 Computational Paralinguistics Challenge, reaching unweighted average recalls (UAR) of 65.41% and [...] for the complaint and request classes, respectively.
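The reported metric, unweighted average recall (UAR), is the mean of the per-class recalls, so each class contributes equally no matter how many samples it has. The following is a minimal sketch of that computation, not the challenge's official scoring code; the function name `unweighted_average_recall` and the toy labels are illustrative assumptions and do not come from the paper.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls (each class weighted equally)."""
    total = defaultdict(int)    # samples seen per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    # Recall of a class = correct predictions for it / samples that truly belong to it.
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Toy, made-up labels for a binary task (NOT data from the challenge).
y_true = ["yes", "yes", "yes", "yes", "no", "no"]
y_pred = ["yes", "yes", "yes", "no", "no", "yes"]

uar = unweighted_average_recall(y_true, y_pred)
print(f"UAR = {uar:.4f}")  # per-class recalls are 3/4 and 1/2, so this prints UAR = 0.6250
```

In this toy case plain accuracy would be 4/6, which is higher than the UAR of 0.625 because the larger class is predicted better; UAR avoids rewarding a model for favoring the majority class.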

research
07/04/2021

Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition

We propose cross-modal transformer-based neural correction models that...
research
08/16/2023

Radio2Text: Streaming Speech Recognition Using mmWave Radio Signals

Millimeter wave (mmWave) based speech recognition provides more possibil...
research
05/31/2023

ViLaS: Integrating Vision and Language into Automatic Speech Recognition

Employing additional multimodal information to improve automatic speech ...
research
03/06/2021

Perspectives and Prospects on Transformer Architecture for Cross-Modal Tasks with Language and Vision

Transformer architectures have brought about fundamental changes to comp...
research
07/10/2023

HCLAS-X: Hierarchical and Cascaded Lyrics Alignment System Using Multimodal Cross-Correlation

In this work, we address the challenge of lyrics alignment, which involv...
research
10/29/2019

Transformer-based Cascaded Multimodal Speech Translation

This paper describes the cascaded multimodal speech translation systems ...
research
04/28/2023

The ACM Multimedia 2023 Computational Paralinguistics Challenge: Emotion Share & Requests

The ACM Multimedia 2023 Computational Paralinguistics Challenge addresse...
