We release Code Llama, a family of large language models for code based ...
Recent work has shown that it is possible to resynthesize high-quality s...
We tackle the task of conditional music generation. We introduce MusicGe...
Speech language models (SpeechLMs) process and generate acoustic data on...
Prior works on improving speech quality with visual input typically stud...
We introduce AudioScopeV2, a state-of-the-art universal audio-visual on-...
In this paper we present VDTTS, a Visually-Driven Text-to-Speech model. ...
We present Translatotron 2, a neural direct speech-to-speech translation...
We introduce a state-of-the-art audio-visual on-screen sound separation ...
Recent progress in deep learning has enabled many advances in sound sepa...
We propose a fully-convolutional neural-network architecture for image d...
This paper presents a weakly-supervised approach to object instance segm...
We present a method to match three-dimensional shapes under non-isometri...
The increasing demand for high image quality in mobile devices brings fo...
The Poisson distribution is used for modeling noise in photon-limited imagin...
With the development of range sensors such as LIDAR and time-of-flight c...
We present a proof-of-concept end-to-end system for computational extend...
Recently, the dense binary pixel Gigavision camera was introduced, ...
We present ASIST, a technique for transforming point clouds by replacing...
Spatially Coherent Random Forest (SCRF) extends Random Forest to create ...
The pursuit of smaller pixel sizes at ever-increasing resolution in digi...