Conversational End-to-End TTS for Voice Agent

05/21/2020
by Haohan Guo, et al.

End-to-end neural TTS has achieved superior performance on reading-style speech synthesis. However, building a high-quality conversational TTS system remains a challenge because of limitations in both the corpus and the modeling capability. This study aims to build a conversational TTS system for a voice agent under the sequence-to-sequence modeling framework. First, we construct a spontaneous conversational speech corpus designed for the voice agent, using a new recording scheme that ensures both recording quality and a conversational speaking style. Second, we propose a conversation context-aware end-to-end TTS approach that uses an auxiliary encoder and a conversational context encoder to reinforce information about the current utterance and its context in the conversation. Experimental results show that the proposed methods produce more natural prosody in accordance with the conversational context, with significant preference gains at both the utterance level and the conversation level. Moreover, we find that the model can express some spontaneous behaviors, such as fillers and repeated words, which makes the conversational speaking style more realistic.
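
The abstract does not say how the auxiliary encoder and the conversational context encoder feed into the synthesizer. The following is only a minimal sketch of one common conditioning pattern, assuming each encoder yields a single fixed-size vector that is appended to every state of the main text encoder. All function names are hypothetical, and mean pooling is a stand-in for whatever context model the authors actually use.

```python
from typing import List

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a non-empty list of equal-length vectors into one vector."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def encode_context(history: List[Vector], dim: int) -> Vector:
    """Hypothetical context encoder: summarise previous utterance embeddings.

    Returns a zero vector when the conversation has no history yet.
    Mean pooling is an assumption, not the method from the paper.
    """
    if not history:
        return [0.0] * dim
    return mean_pool(history)


def condition_encoder_outputs(
    text_states: List[Vector],  # one vector per input token (main encoder)
    utterance_vec: Vector,      # hypothetical auxiliary-encoder output
    context_vec: Vector,        # hypothetical context-encoder output
) -> List[Vector]:
    """Append the utterance and context vectors to every encoder state."""
    return [state + utterance_vec + context_vec for state in text_states]


# Toy example: two previous utterances, two tokens in the current one.
history = [[1.0, 3.0], [3.0, 5.0]]
context = encode_context(history, dim=2)
states = condition_encoder_outputs([[0.1], [0.2]], [9.0], context)
```

In this sketch, the attention-based decoder of a sequence-to-sequence TTS model would then attend over `states`, so every decoding step sees both the current-utterance summary and the conversation summary.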

research
04/04/2019

An End-to-End Conversational Style Matching Agent

We present an end-to-end voice-based conversational agent that is able t...
research
05/02/2022

How does a spontaneously speaking conversational agent affect user behavior?

This study investigated the effect of synthetic voice of conversational ...
research
09/09/2018

Attentional Multi-Reading Sarcasm Detection

Recognizing sarcasm often requires a deep understanding of multiple sour...
research
06/27/2019

Gated Embeddings in End-to-End Speech Recognition for Conversational-Context Fusion

We present a novel conversational-context aware end-to-end speech recogn...
research
06/21/2021

Controllable Context-aware Conversational Speech Synthesis

In spoken conversations, spontaneous behaviors like filled pause and pro...
research
05/22/2018

Context-Aware Sequence-to-Sequence Models for Conversational Systems

This work proposes a novel approach based on sequence-to-sequence (seq2s...
research
07/07/2021

Anticipating Safety Issues in E2E Conversational AI: Framework and Tooling

Over the last several years, end-to-end neural conversational agents hav...