We present Spatial LibriSpeech, a spatial audio dataset with over 650 ho...
Generating realistic lip motions to simulate speech production is key fo...
As the use of deep learning in high impact domains becomes ubiquitous, i...
Deployed machine learning models are evaluated by multiple metrics beyon...
Image augmentations applied during training are crucial for the
generali...
We study the presence of expert units in pre-trained Transformer-based
L...
Automatic speech recognition (ASR) is widely used in consumer electronic...
To detect bias in face recognition networks, it can be useful to probe a...
We describe our novel deep learning approach for driving animated faces ...
In this work we study the presence of expert units in pre-trained Transf...
Speech-driven visual speech synthesis involves mapping features extracte...
We describe experiments towards building a conversational digital assist...
We propose a method for modeling and learning turn-taking behaviors for
...
Principal Filter Analysis (PFA), is an elegant, easy to implement, yet
e...