Classifying topics in speech when all you have is crummy translations

08/29/2019
by Sameer Bansal, et al.

Given a large amount of unannotated speech in a language with few resources, can we classify the speech utterances by topic? We show that this is possible if text translations are available for just a small amount of speech (less than 20 hours), using a recent model for direct speech-to-text translation. While the translations are poor, they are still good enough to correctly classify 1-minute speech segments over 70% of the time, outperforming a majority-class baseline. Such a system might be useful for humanitarian applications like crisis response, where incoming speech must be quickly assessed for further action.
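
To make the comparison against a majority-class baseline concrete, here is a minimal Python sketch of how such a baseline could be computed and set against a classifier's accuracy. It is not the authors' code: the topic labels, the predictions, and the helper names `majority_class_baseline` and `accuracy` are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def majority_class_baseline(train_labels, test_labels):
    """Return the most frequent training label and the accuracy
    obtained by predicting that label for every test segment."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    correct = sum(1 for label in test_labels if label == majority_label)
    return majority_label, correct / len(test_labels)


def accuracy(predicted, gold):
    """Fraction of segments whose predicted topic matches the gold topic."""
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Hypothetical toy topic labels, invented for illustration only.
train_labels = ["family", "family", "family", "health", "music"]
test_labels = ["family", "health", "family", "music"]

# Hypothetical predictions, standing in for a topic classifier
# run on the (noisy) translations of each speech segment.
predicted = ["family", "health", "family", "family"]

majority_label, baseline_acc = majority_class_baseline(train_labels, test_labels)
classifier_acc = accuracy(predicted, test_labels)

print(f"majority label: {majority_label}")
print(f"baseline accuracy: {baseline_acc:.2f}")
print(f"classifier accuracy: {classifier_acc:.2f}")
```

A classifier is only informative to the extent that it beats this baseline, since the baseline requires no access to the speech or its translation at all.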


research
11/08/2022

SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations

We present SpeechMatrix, a large-scale multilingual corpus of speech-to-...
research
02/09/2018

Augmenting Librispeech with French Translations: A Multimodal Corpus for Direct Speech Translation Evaluation

Recent works in spoken language translation (SLT) have attempted to buil...
research
02/25/2023

Jointly Optimizing Translations and Speech Timing to Improve Isochrony in Automatic Dubbing

Automatic dubbing (AD) is the task of translating the original speech in...
research
06/03/2019

Fluent Translations from Disfluent Speech in End-to-End Speech Translation

Spoken language translation applications for speech suffer due to conver...
research
01/13/2022

Speech Resources in the Tamasheq Language

In this paper we present two datasets for Tamasheq, a developing languag...
research
04/13/2023

Bidirectional UML Visualisation of VDM Models

The VDM-PlantUML Plugin enables translations between the text based UML ...
research
02/14/2017

A case study on using speech-to-translation alignments for language documentation

For many low-resource or endangered languages, spoken language resources...
