FlexIT: Towards Flexible Semantic Image Translation

03/09/2022
by   Guillaume Couairon, et al.
0

Deep generative models, like GANs, have considerably improved the state of the art in image synthesis, and are able to generate near photo-realistic images in structured domains such as human faces. Based on this success, recent work on image editing proceeds by projecting images to the GAN latent space and manipulating the latent vector. However, these approaches are limited in that only images from a narrow domain can be transformed, and with only a limited number of editing operations. We propose FlexIT, a novel method which can take any input image and a user-defined text instruction for editing. Our method achieves flexible and natural editing, pushing the limits of semantic image translation. First, FlexIT combines the input image and text into a single target point in the CLIP multimodal embedding space. Via the latent space of an auto-encoder, we iteratively transform the input image toward the target point, ensuring coherence and quality with a variety of novel regularization terms. We propose an evaluation protocol for semantic image translation, and thoroughly evaluate our method on ImageNet. Code will be made publicly available.

READ FULL TEXT

page 1

page 5

page 6

page 7

page 15

page 16

page 17

research
03/31/2020

In-Domain GAN Inversion for Real Image Editing

Recent work has shown that a variety of controllable semantics emerges i...
research
01/28/2020

Controlling generative models with continuous factors of variations

Recent deep generative models are able to provide photo-realistic images...
research
06/03/2020

Nested Scale Editing for Conditional Image Synthesis

We propose an image synthesis approach that provides stratified navigati...
research
02/06/2023

Zero-shot Image-to-Image Translation

Large-scale text-to-image generative models have shown their remarkable ...
research
03/19/2021

Paint by Word

We investigate the problem of zero-shot semantic image painting. Instead...
research
04/14/2023

Delta Denoising Score

We introduce Delta Denoising Score (DDS), a novel scoring function for t...
research
12/06/2022

Domain Translation via Latent Space Mapping

In this paper, we investigate the problem of multi-domain translation: g...

Please sign up or login with your details

Forgot password? Click here to reset