Alexis Moinet

research ∙ 09/04/2023

A Comparative Analysis of Pretrained Language Models for Text-to-Speech

State-of-the-art text-to-speech (TTS) systems have utilized pretrained l...

Marcel Granero Moya, et al.

research ∙ 07/13/2023

Controllable Emphasis with zero data for text-to-speech

We present a scalable method to produce high quality emphasis for text-t...

Arnaud Joly, et al.

research ∙ 06/20/2023

eCat: An End-to-End Model for Multi-Speaker TTS Many-to-Many Fine-Grained Prosody Transfer

We present eCat, a novel end-to-end multispeaker model capable of: a) ge...

Ammar Abbas, et al.

research ∙ 06/29/2022

Simple and Effective Multi-sentence TTS with Expressive and Coherent Prosody

Generating expressive and contextually appropriate prosody remains a cha...

Peter Makarov, et al.

research ∙ 06/28/2022

Expressive, Variable, and Controllable Duration Modelling in TTS

Duration modelling has become an important research problem once more wi...

Ammar Abbas, et al.

research ∙ 06/27/2022

CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer

In this paper, we present CopyCat2 (CC2), a novel model capable of: a) s...

Sri Karlapati, et al.

research ∙ 02/13/2022

Distribution augmentation for low-resource expressive text-to-speech

This paper presents a novel data augmentation technique for text-to-spee...

Mateusz Łajszczak, et al.

research ∙ 06/29/2021

Multi-Scale Spectrogram Modelling for Neural Text-to-Speech

We propose a novel Multi-Scale Spectrogram (MSS) modelling approach to s...

Ammar Abbas, et al.

research ∙ 06/14/2021

A learned conditional prior for the VAE acoustic space of a TTS system

Many factors influence speech yielding different renditions of a given s...

Penny Karanasou, et al.

research ∙ 12/17/2020

Parallel WaveNet conditioned on VAE latent vectors

Recently the state-of-the-art text-to-speech synthesis systems have shif...

Jonas Rohnke, et al.

research ∙ 11/04/2020

Prosodic Representation Learning and Contextual Sampling for Neural Text-to-Speech

In this paper, we introduce Kathaka, a model trained with a novel two-st...

Sri Karlapati, et al.

research ∙ 05/24/2020

Glottal source estimation robustness: A comparison of sensitivity of voice source estimation techniques

This paper addresses the problem of estimating the voice source directly...

Thomas Drugman, et al.

research ∙ 04/30/2020

CopyCat: Many-to-Many Fine-Grained Prosody Transfer for Neural Text-to-Speech

Prosody Transfer (PT) is a technique that aims to use the prosody from a...

Sri Karlapati, et al.

research ∙ 12/30/2019

Using a Pitch-Synchronous Residual Codebook for Hybrid HMM/Frame Selection Speech Synthesis

This paper proposes a method to improve the quality delivered by statist...

Thomas Drugman, et al.

research ∙ 12/12/2019

Singing Synthesis: with a little help from my attention

We present a novel system for singing synthesis, based on attention. Sta...

Orazio Angelini, et al.

research ∙ 12/11/2019

Voice Conversion for Whispered Speech Synthesis

We present an approach to synthesize whisper by applying a handcrafted s...

Marius Cotescu, et al.

research ∙ 03/04/2019

Traditional Machine Learning for Pitch Detection

Pitch detection is a fundamental problem in speech processing as F0 is u...

Thomas Drugman, et al.

research ∙ 11/15/2018

Comprehensive evaluation of statistical speech waveform synthesis

Statistical TTS systems that directly predict the speech waveform have r...

Thomas Merritt, et al.

research ∙ 01/19/2018

Proceedings of eNTERFACE 2015 Workshop on Intelligent Interfaces

The 11th Summer Workshop on Multimodal Interfaces eNTERFACE 2015 was hos...

Matei Mancas, et al.

