LatentAvatar: Learning Latent Expression Code for Expressive Neural Head Avatar

by Yuelang Xu, et al.

Existing approaches to animatable NeRF-based head avatars are either built upon face templates or use the expression coefficients of templates as the driving signal. Despite promising progress, their performance is heavily bound by the expressive power and the tracking accuracy of the templates. In this work, we present LatentAvatar, an expressive neural head avatar driven by latent expression codes. These latent expression codes are learned in an end-to-end, self-supervised manner without templates, which lets our method avoid the expression and tracking limitations of templates. To achieve this, we leverage a latent head NeRF to learn person-specific latent expression codes from a monocular portrait video, and further design a Y-shaped network to learn shared latent expression codes across different subjects for cross-identity reenactment. By optimizing the photometric reconstruction objectives in NeRF, the latent expression codes are learned to be 3D-aware while faithfully capturing high-frequency, detailed expressions. Moreover, by learning a mapping between the latent expression codes learned in the shared and person-specific settings, LatentAvatar is able to perform expressive reenactment between different subjects. Experimental results show that LatentAvatar captures challenging expressions and the subtle movement of teeth and even eyeballs, and that it outperforms previous state-of-the-art solutions in both quantitative and qualitative comparisons. Project page:
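The data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes the Y-shaped network is one shared encoder feeding one decoder per subject, it stands in single linear layers for the real networks, and it omits the NeRF renderer and all training. Every name here (YShapedSketch, CodeMapper, reenactment_code, the dimensions) is hypothetical.

```python
import random


def make_linear(n_in, n_out, rng):
    """Create a randomly initialised linear layer as (weights, bias)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def linear(layer, x):
    """Apply a linear layer to a flat list of floats."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


class YShapedSketch:
    """Assumed layout: one shared encoder (the stem) and one decoder per subject (the branches)."""

    def __init__(self, image_dim, code_dim, subjects, seed=0):
        rng = random.Random(seed)
        self.encoder = make_linear(image_dim, code_dim, rng)
        self.decoders = {s: make_linear(code_dim, image_dim, rng) for s in subjects}

    def encode(self, image):
        # Shared latent expression code, computed the same way for every subject.
        return linear(self.encoder, image)

    def reconstruct(self, image, subject):
        # Decode the shared code with the branch that belongs to `subject`.
        return linear(self.decoders[subject], self.encode(image))


class CodeMapper:
    """Hypothetical mapping from a shared code to a person-specific code."""

    def __init__(self, shared_dim, personal_dim, seed=1):
        self.layer = make_linear(shared_dim, personal_dim, random.Random(seed))

    def __call__(self, shared_code):
        return linear(self.layer, shared_code)


def reenactment_code(y_net, mapper, actor_image):
    """Actor frame -> shared code -> avatar's person-specific code."""
    return mapper(y_net.encode(actor_image))


# Usage: a flattened 12-value "image" of the actor yields a 6-value avatar code.
net = YShapedSketch(image_dim=12, code_dim=4, subjects=("avatar", "actor"))
mapper = CodeMapper(shared_dim=4, personal_dim=6)
driving_code = reenactment_code(net, mapper, [0.5] * 12)
```

In the pipeline the abstract describes, the resulting person-specific code would then condition the avatar's latent head NeRF; that rendering step is outside the scope of this sketch.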




