StyleAvatar: Real-time Photo-realistic Portrait Avatar from a Single Video

by Lizhen Wang, et al.

Face reenactment methods attempt to restore and re-animate portrait videos as realistically as possible. Existing methods face a trade-off between quality and controllability: 2D GAN-based methods achieve higher image quality but offer weaker fine-grained control of facial attributes than their 3D counterparts. In this work, we propose StyleAvatar, a real-time photo-realistic portrait avatar reconstruction method using StyleGAN-based networks, which can generate high-fidelity portrait avatars with faithful expression control. We expand the capabilities of StyleGAN by introducing a compositional representation and a sliding window augmentation method, which enable faster convergence and improve translation generalization. Specifically, we divide the portrait scene into three parts for adaptive adjustment: the facial region, the non-facial foreground region, and the background. In addition, our network combines the strengths of UNet, StyleGAN, and time coding for video learning, which enables high-quality video generation. A sliding window augmentation method and a pre-training strategy are proposed to improve translation generalization and training performance, respectively. The proposed network can converge within two hours while ensuring high image quality and a forward rendering time of only 20 milliseconds. We also propose a real-time live system, which further pushes research into applications. Results and experiments demonstrate the superiority of our method in terms of image quality, full portrait video generation, and real-time re-animation compared to existing facial reenactment methods. Training and inference code for this paper are at
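The abstract does not spell out how the sliding window augmentation works. One plausible reading is that a crop window is shifted to a random position within each training frame, so the network sees the portrait at varied offsets and generalizes better to translation. The sketch below illustrates that reading only; the function name `sliding_window_crop`, the toy frame sizes, and the use of nested lists as images are all assumptions made for illustration, not the authors' implementation.

```python
import random


def sliding_window_crop(frames, window, rng):
    """Crop the same randomly placed window from every aligned frame.

    Hypothetical helper: `frames` is a list of equally sized images stored
    as lists of rows, and `window` is (height, width) of the crop.
    """
    height, width = len(frames[0]), len(frames[0][0])
    win_h, win_w = window
    if win_h > height or win_w > width:
        raise ValueError("window must fit inside the frame")
    # Pick one offset and reuse it, so input and target stay aligned.
    top = rng.randint(0, height - win_h)
    left = rng.randint(0, width - win_w)
    crops = [
        [row[left:left + win_w] for row in frame[top:top + win_h]]
        for frame in frames
    ]
    return crops, (top, left)


# Toy 6x8 "frames": a conditioning image and the matching target image.
rng = random.Random(0)
condition = [[(y, x) for x in range(8)] for y in range(6)]
target = [[10 * y + x for x in range(8)] for y in range(6)]

(cond_crop, target_crop), (top, left) = sliding_window_crop(
    [condition, target], (4, 4), rng
)
```

Sharing a single offset across the conditioning image and the target keeps the pair pixel-aligned, which an image-to-image network of this kind would need; sampling a new offset per training step is what would supply the translation variety.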



High-Quality Real Time Facial Capture Based on Single Camera

We propose a real time deep learning framework for video-based facial ex...

3DFaceNet: Real-time Dense Face Reconstruction via Synthesizing Photo-realistic Face Images

With the powerfulness of convolution neural networks (CNN), CNN based fa...

Real Time Fabric Defect Detection System on an Embedded DSP Platform

In industrial fabric productions, automated real time systems are needed...

Learning Fine-Grained Motion Embedding for Landscape Animation

In this paper we focus on landscape animation, which aims to generate ti...

GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation

Generating talking person portraits with arbitrary speech audio is a cru...

NOFA: NeRF-based One-shot Facial Avatar Reconstruction

3D facial avatar reconstruction has been a significant research topic in...

PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering

Generating portrait images by controlling the motions of existing faces ...
