We present TokenSplit, a speech separation model that acts on discrete t...
We introduce AudioPaLM, a large language model for speech understanding ...
We present SoundStorm, a model for efficient, non-autoregressive audio
g...
We introduce LMCodec, a causal neural speech codec that provides high qu...
We introduce SPEAR-TTS, a multi-speaker text-to-speech (TTS) system that...
We introduce MusicLM, a model generating high-fidelity music from text
d...
We introduce AudioLM, a framework for high-quality audio generation with...
We propose a method of separating a desired sound source from a
single-c...
Typically, neural network-based speech dereverberation models are traine...
We present a method to separate speech signals from noisy environments i...
With the growth of computing power on mobile phones and privacy concerns...
We propose SpeechPainter, a model for filling in gaps of up to one secon...
The increasing availability of massive data sets poses a series of chall...
We present SoundStream, a novel neural audio codec that can efficiently
...
Real-world sound scenes consist of time-varying collections of sound sou...
Mel-filterbanks are fixed, engineered audio features which emulate human...
In this paper we propose a lightweight model for frequency bandwidth
ext...
A crucial aspect for the successful deployment of audio-based models
"in...
Active learning is an effective technique for reducing the labeling cost...
We explore the possibility of leveraging accelerometer data to perform s...
We propose an audio-to-audio neural network model that learns to denoise...
The ultimate goal of transfer learning is to reduce labeled data require...
We learn audio representations by solving a novel self-supervised learni...
We propose a model to estimate the fundamental frequency in monophonic a...
We consider the problem of generating plausible and diverse video sequen...
We explore self-supervised models that can be potentially deployed on mo...
Pedestrian detection is a popular research topic due to its paramount
im...
We present a system for complementing snow phenomena monitoring with vir...
A number of computer vision tasks exploit a succinct representation of t...
Distributed visual analysis applications, such as mobile visual search o...
Binary local features represent an effective alternative to real-valued
...