Automatic Heteronym Resolution Pipeline Using RAD-TTS Aligners

02/28/2023
by   Jocelyn Huang, et al.
0

Grapheme-to-phoneme (G2P) transduction is part of the standard text-to-speech (TTS) pipeline. However, G2P conversion is difficult for languages that contain heteronyms – words that have one spelling but can be pronounced in multiple ways. G2P datasets with annotated heteronyms are limited in size and expensive to create, as human labeling remains the primary method for heteronym disambiguation. We propose a RAD-TTS Aligner-based pipeline to automatically disambiguate heteronyms in datasets that contain both audio with text transcripts. The best pronunciation can be chosen by generating all possible candidates for each heteronym and scoring them with an Aligner model. The resulting labels can be used to create training datasets for use in both multi-stage and end-to-end G2P systems.

READ FULL TEXT
research
10/29/2018

Investigation of enhanced Tacotron text-to-speech synthesis systems with self-attention for pitch accent language

End-to-end speech synthesis is a promising approach that directly conver...
research
11/04/2020

Handwriting Classification for the Analysis of Art-Historical Documents

Digitized archives contain and preserve the knowledge of generations of ...
research
09/22/2022

Characterizing Uncertainty in the Visual Text Analysis Pipeline

Current visual text analysis approaches rely on sophisticated processing...
research
07/31/2023

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

Single-stage text-to-speech models have been actively studied recently, ...
research
04/11/2021

NeMo Toolbox for Speech Dataset Construction

In this paper, we introduce a new toolbox for constructing speech datase...
research
07/31/2020

An Empirical Study on Explainable Prediction of Text Complexity: Preliminaries for Text Simplification

Text simplification is concerned with reducing the language complexity a...
research
10/25/2021

"So You Think You're Funny?": Rating the Humour Quotient in Standup Comedy

Computational Humour (CH) has attracted the interest of Natural Language...

Please sign up or login with your details

Forgot password? Click here to reset