We explore the use of neural synthesis for acoustic guitar from string-w...
The success of deep learning in speaker recognition relies heavily on th...
With the growing amount of musical data available, automatic instrument ...
Speaker anonymization aims to conceal a speaker's identity while preserv...
Spoof localization, also called segment-level detection, is a crucial ta...
With the similarity between music and speech synthesis from symbolic inp...
Conventional automatic speaker verification systems can usually be decom...
Automatic speaker verification is susceptible to various manipulations a...
In our previous work, we proposed a language-independent speaker anonymi...
We present the first edition of the VoiceMOS Challenge, a scientific eve...
Speaker anonymization aims to protect the privacy of speakers while pres...
An effective approach to automatically predict the subjective rating for...
Are end-to-end text-to-speech (TTS) models over-parametrized? To what ex...
In this paper, we provide a series of multi-tasking benchmarks for simul...
Timbre representations of musical instruments, essential for diverse app...
Shared challenges provide a venue for comparing systems trained on commo...
This work examines the content and usefulness of disentangled phone and ...
Speech synthesis and music audio generation from symbolic input differ i...
All existing databases of spoofed speech contain attack data that is spo...
A back-end model is a key element of modern speaker verification systems...
We explore pretraining strategies including choice of base corpus with t...
We have been working on speech synthesis for rakugo (a traditional Japan...
We present a new approach to disentangle speaker voice and phone content...
End-to-end models, particularly Tacotron-based ones, are currently a pop...
Vector Quantized Variational AutoEncoders (VQ-VAE) are a powerful repres...