Parameter-Efficient Learning for Text-to-Speech Accent Adaptation

05/18/2023
by   Li-Jen Yang, et al.
This paper presents a parameter-efficient learning (PEL) approach to low-resource accent adaptation for text-to-speech (TTS). We adapt a frozen pre-trained TTS model using only 0.8% to 1.2% of the original trainable parameters while achieving competitive performance in voice synthesis. Motivated by the theoretical foundation of optimal transport (OT), this study carries out PEL for TTS by introducing, in addition to the supervised training loss, an auxiliary unsupervised loss based on OT to maximize a difference between the pre-trained source domain and the (unseen) target domain. We further refine this unsupervised loss to boost system performance, using either the sliced Wasserstein distance or the maximum mean discrepancy. The merit of this work is demonstrated by implementing PEL solutions based on residual adapter learning and model reprogramming, evaluated on Mandarin accent adaptation. Experimental results show that the proposed methods can achieve competitive naturalness with parameter-efficient decoder fine-tuning, and that the auxiliary unsupervised loss improves model performance empirically.
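The two discrepancy measures named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the number of random projections, the RBF kernel and its bandwidth, and the toy feature arrays are all assumptions made for illustration, and how the paper combines the resulting value with the supervised loss is not shown here.

```python
import numpy as np


def sliced_wasserstein(x, y, n_proj=64, seed=0):
    """Monte Carlo estimate of the squared sliced 2-Wasserstein distance.

    x, y: arrays of shape (n, d) holding the same number of samples.
    n_proj and seed are illustrative defaults, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    # Draw random unit directions, one per column.
    theta = rng.normal(size=(x.shape[1], n_proj))
    theta /= np.linalg.norm(theta, axis=0, keepdims=True)
    # Project onto each direction; in 1-D, optimal transport between
    # equal-sized sample sets pairs the sorted values.
    px = np.sort(x @ theta, axis=0)
    py = np.sort(y @ theta, axis=0)
    return float(np.mean((px - py) ** 2))


def mmd_rbf(x, y, sigma=1.0):
    """Biased estimate of the squared maximum mean discrepancy (RBF kernel).

    The kernel choice and bandwidth sigma are assumptions for this sketch.
    """
    def kernel(a, b):
        sq = np.sum(a ** 2, axis=1)[:, None] + np.sum(b ** 2, axis=1)[None, :] - 2.0 * a @ b.T
        return np.exp(-sq / (2.0 * sigma ** 2))

    return float(kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean())


# Toy stand-ins for source-domain and target-domain feature batches.
rng = np.random.default_rng(1)
source_feats = rng.normal(size=(128, 8))
target_feats = source_feats + 2.0  # a shifted copy, purely illustrative

swd = sliced_wasserstein(source_feats, target_feats)
mmd = mmd_rbf(source_feats, target_feats)
```

Both functions return zero for identical sample sets and a positive value for the shifted copy; either scalar could play the role of an auxiliary term alongside a supervised loss.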

research
06/27/2023

Hyper-parameter Adaptation of Conformer ASR Systems for Elderly and Dysarthric Speech Recognition

Automatic recognition of disordered and elderly speech remains highly ch...
research
11/02/2022

Low-Resource Music Genre Classification with Advanced Neural Model Reprogramming

Transfer learning (TL) approaches have shown promising results when hand...
research
06/09/2021

Neural Supervised Domain Adaptation by Augmenting Pre-trained Models with Random Units

Neural Transfer Learning (TL) is becoming ubiquitous in Natural Language...
research
04/07/2019

Joint Learning of Pre-Trained and Random Units for Domain Adaptation in Part-of-Speech Tagging

Fine-tuning neural networks is widely used to transfer valuable knowledg...
research
03/27/2023

Scaling Pre-trained Language Models to Deeper via Parameter-efficient Architecture

In this paper, we propose a highly parameter-efficient approach to scali...
research
09/21/2023

Sparsely Shared LoRA on Whisper for Child Speech Recognition

Whisper is a powerful automatic speech recognition (ASR) model. Neverthe...
research
05/19/2023

Differentially Private Adapters for Parameter Efficient Acoustic Modeling

In this work, we devise a parameter-efficient solution to bring differen...
