We examine the speech modeling potential of generative spoken language m...
We present the JNV (Japanese Nonverbal Vocalizations) corpus, a corpus of Ja...
We present a large-scale in-the-wild Japanese laughter corpus and a laug...
Pause insertion, also known as phrase break prediction and phrasing, is ...
We present a multi-speaker Japanese audiobook text-to-speech (TTS) syste...
In this paper, we propose a method for intermediating multiple speakers'...
We present an emotion recognition system for nonverbal vocalizations (NV...
This paper presents a speaking-rate-controllable HiFi-GAN neural vocoder...
We present the UTokyo-SaruLab mean opinion score (MOS) prediction system...