Dialogs Re-enacted Across Languages

11/18/2022
by Nigel G. Ward, et al.

To support machine learning of cross-language prosodic mappings and other ways to improve speech-to-speech translation, we present a protocol for collecting closely matched pairs of utterances across languages, a description of the resulting data collection, and some observations and musings. This report is intended for 1) people using the corpus, 2) people extending the corpus, and 3) people designing similar collections of bilingual dialog data.


