Language-Agnostic Meta-Learning for Low-Resource Text-to-Speech with Articulatory Features

03/07/2022
by   Florian Lux, et al.

While neural text-to-speech systems perform remarkably well in high-resource scenarios, they cannot be applied to the majority of the over 6,000 spoken languages in the world due to a lack of appropriate training data. In this work, we use embeddings derived from articulatory vectors rather than from phoneme identities to learn phoneme representations that hold across languages. In conjunction with language-agnostic meta-learning, this enables us to fine-tune a high-quality text-to-speech model on just 30 minutes of data in a previously unseen language spoken by a previously unseen speaker.
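The substitution described in the abstract can be illustrated with a minimal, hypothetical sketch: instead of a learned lookup table keyed on phoneme IDs, each phoneme is described by a fixed articulatory feature vector shared across languages, and a learned projection maps that vector into the embedding space the TTS encoder consumes. The PyTorch code below is not the authors' implementation; the feature table values, dimensions, and class names are placeholders chosen only to show the idea.

```python
# Sketch: articulatory-feature phoneme embeddings instead of ID-based lookup.
# Because the articulatory feature inventory is language-independent, phonemes
# of an unseen language still map onto meaningful points in embedding space.

import torch
import torch.nn as nn

# Hypothetical articulatory feature table: one row per phoneme, one column per
# articulatory feature (the values here are placeholders, not real features).
ARTICULATORY_FEATURES = {
    "p": [1, 0, 0, 1, 0, 0],   # voiceless bilabial plosive
    "b": [1, 0, 0, 1, 1, 0],   # voiced bilabial plosive
    "a": [0, 1, 1, 0, 1, 1],   # open front vowel
    # ... one entry per phoneme in a shared IPA-based inventory
}

class ArticulatoryPhonemeEmbedding(nn.Module):
    """Projects fixed articulatory feature vectors into the encoder's
    embedding space, replacing an ID-based nn.Embedding lookup."""

    def __init__(self, feature_dim: int = 6, embed_dim: int = 64):
        super().__init__()
        self.proj = nn.Linear(feature_dim, embed_dim)

    def forward(self, feature_vectors: torch.Tensor) -> torch.Tensor:
        # feature_vectors: (batch, sequence_length, feature_dim)
        return self.proj(feature_vectors)

# Usage: look up articulatory vectors for a phoneme sequence, then embed them.
phonemes = ["b", "a", "p", "a"]
feats = torch.tensor([ARTICULATORY_FEATURES[p] for p in phonemes],
                     dtype=torch.float32).unsqueeze(0)  # (1, 4, 6)
embedder = ArticulatoryPhonemeEmbedding()
encoder_input = embedder(feats)  # (1, 4, 64), fed to the TTS encoder
```

In an ID-based setup, the embedding of a phoneme never seen during pretraining is essentially random; here it is anchored by its articulatory similarity to known phonemes, which is what makes fine-tuning on only 30 minutes of target-language data plausible.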

Related research

10/21/2022 · Low-Resource Multilingual and Zero-Shot Multispeaker TTS
While neural methods for text-to-speech (TTS) have shown great advances ...

10/26/2019 · Meta Learning for End-to-End Low-Resource Speech Recognition
In this paper, we proposed to apply meta learning approach for low-resou...

01/20/2023 · Language Agnostic Data-Driven Inverse Text Normalization
With the emergence of automatic speech recognition (ASR) models, convert...

06/07/2021 · SIGTYP 2021 Shared Task: Robust Spoken Language Identification
While language identification is a fundamental speech and language proce...

05/11/2022 · Improved Meta Learning for Low Resource Speech Recognition
We propose a new meta learning based framework for low resource speech r...

05/24/2020 · When does MAML Work the Best? An Empirical Study on Model-Agnostic Meta-Learning in NLP Applications
Model-Agnostic Meta-Learning (MAML), a model-agnostic meta-learning meth...

03/11/2022 · Improving the transferability of speech separation by meta-learning
Speech separation aims to separate multiple speech sources from a speech...
