CML-TTS A Multilingual Dataset for Speech Synthesis in Low-Resource Languages

06/16/2023
by   Frederico S. Oliveira, et al.

In this paper, we present CML-TTS, a recursive acronym for CML-Multi-Lingual-TTS, a new Text-to-Speech (TTS) dataset developed at the Center of Excellence in Artificial Intelligence (CEIA) of the Federal University of Goias (UFG). CML-TTS is based on Multilingual LibriSpeech (MLS) and adapted for training TTS models; it consists of audiobooks in seven languages: Dutch, French, German, Italian, Portuguese, Polish, and Spanish. Additionally, we provide the YourTTS model, a multi-lingual TTS model, trained on 3,176.13 hours from CML-TTS and 245.07 hours of English speech from LibriTTS. Our purpose in creating this dataset is to open up new research possibilities in the TTS area for multi-lingual models. The dataset is publicly available under the CC-BY 4.0 license.

research
12/07/2020

MLS: A Large-Scale Multilingual Dataset for Speech Research

This paper introduces Multilingual LibriSpeech (MLS) dataset, a large mu...
research
06/07/2023

Zambezi Voice: A Multilingual Speech Corpus for Zambian Languages

This work introduces Zambezi Voice, an open-source multilingual speech r...
research
02/21/2018

Sequence-based Multi-lingual Low Resource Speech Recognition

Techniques for multi-lingual and cross-lingual speech recognition can he...
research
09/15/2021

A Conditional Generative Matching Model for Multi-lingual Reply Suggestion

We study the problem of multilingual automated reply suggestions (RS) mo...
research
05/25/2023

Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration

This work aims to build a multilingual text-to-speech (TTS) synthesis sy...
research
11/23/2018

Learning pronunciation from a foreign language in speech synthesis networks

Although there are more than 6,500 languages in the world, the pronunci...
research
09/28/2021

Exploring Teacher-Student Learning Approach for Multi-lingual Speech-to-Intent Classification

End-to-end speech-to-intent classification has shown its advantage in ha...
