Video-to-Video Translation for Visual Speech Synthesis

05/28/2019
by Michail C. Doukas, et al.

Despite the remarkable success of image-to-image translation, driven by advances in generative adversarial networks (GANs), very few attempts at translation in the video domain are known. We study video-to-video translation in the context of visual speech generation, where the goal is to transform an input video of any spoken word into an output video of a different word. This is a multi-domain translation problem in which each word defines a domain: the set of videos uttering that word. Adapting the state-of-the-art image-to-image translation model (StarGAN) to this setting falls short when the vocabulary is large. Instead, we propose to use character encodings of the words and design a novel character-based GAN architecture for video-to-video translation called Visual Speech GAN (ViSpGAN). We are the first to demonstrate video-to-video translation with a vocabulary of 500 words.
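
To make the contrast between word-level domain labels and character encodings concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the ViSpGAN implementation: the abstract does not specify the encoding, so the lowercase alphabet, the extra padding slot, the `max_len` parameter, the toy vocabulary, and both function names are hypothetical.

```python
import string

ALPHABET = string.ascii_lowercase  # assumption: lowercase English letters only
PAD_INDEX = len(ALPHABET)          # assumption: one extra slot marks padding


def one_hot_word_label(word, vocabulary):
    """Word-level domain label: one slot per word in the vocabulary."""
    label = [0] * len(vocabulary)
    label[vocabulary.index(word)] = 1
    return label


def char_encoding(word, max_len):
    """Hypothetical character encoding: one one-hot row per character position."""
    if len(word) > max_len:
        raise ValueError("word is longer than max_len")
    rows = []
    for pos in range(max_len):
        row = [0] * (len(ALPHABET) + 1)
        if pos < len(word):
            row[ALPHABET.index(word[pos].lower())] = 1
        else:
            row[PAD_INDEX] = 1  # positions past the end of the word are padding
        rows.append(row)
    return rows


# Toy vocabulary for illustration only, not the 500 words used in the paper.
vocabulary = ["about", "above", "after"]

word_label = one_hot_word_label("above", vocabulary)  # length grows with the vocabulary
char_label = char_encoding("above", max_len=8)        # size fixed at max_len x 27
```

In this sketch the word-level label gains one entry for every word added to the vocabulary, while the character encoding keeps the same shape regardless of vocabulary size, and words that share letters share parts of their encoding.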


