FaceDiffuser: Speech-Driven 3D Facial Animation Synthesis Using Diffusion

09/20/2023
by   Stefan Stan, et al.
0

Speech-driven 3D facial animation synthesis has been a challenging task both in industry and research. Recent methods mostly focus on deterministic deep learning methods meaning that given a speech input, the output is always the same. However, in reality, the non-verbal facial cues that reside throughout the face are non-deterministic in nature. In addition, majority of the approaches focus on 3D vertex based datasets and methods that are compatible with existing facial animation pipelines with rigged characters is scarce. To eliminate these issues, we present FaceDiffuser, a non-deterministic deep learning model to generate speech-driven facial animations that is trained with both 3D vertex and blendshape based datasets. Our method is based on the diffusion technique and uses the pre-trained large speech representation model HuBERT to encode the audio input. To the best of our knowledge, we are the first to employ the diffusion method for the task of speech-driven 3D facial animation synthesis. We have run extensive objective and subjective analyses and show that our approach achieves better or comparable results in comparison to the state-of-the-art methods. We also introduce a new in-house dataset that is based on a blendshape based rigged character. We recommend watching the accompanying supplementary video. The code and the dataset will be publicly available.

READ FULL TEXT

page 1

page 7

page 9

page 12

page 14

page 15

research
03/09/2023

FaceXHuBERT: Text-less Speech-driven E(X)pressive 3D Facial Animation Synthesis Using Self-Supervised Speech Representation Learning

This paper presents FaceXHuBERT, a text-less speech-driven 3D facial ani...
research
12/30/2022

Imitator: Personalized Speech-driven 3D Facial Animation

Speech-driven 3D facial animation has been widely explored, with applica...
research
05/15/2023

Laughing Matters: Introducing Laughing-Face Generation using Diffusion Models

Speech-driven animation has gained significant traction in recent years,...
research
04/15/2019

Synthesising 3D Facial Motion from "In-the-Wild" Speech

Synthesising 3D facial motion from speech is a crucial problem manifesti...
research
05/02/2022

A Novel Speech-Driven Lip-Sync Model with CNN and LSTM

Generating synchronized and natural lip movement with speech is one of t...
research
12/12/2019

Speech-driven facial animation using polynomial fusion of features

Speech-driven facial animation involves using a speech signal to generat...
research
04/30/2023

Deep Learning-based Spatio Temporal Facial Feature Visual Speech Recognition

In low-resource computing contexts, such as smartphones and other tiny d...

Please sign up or login with your details

Forgot password? Click here to reset