Multimodal Conditional Image Synthesis with Product-of-Experts GANs

12/09/2021
by Xun Huang, et al.

Existing conditional image synthesis frameworks generate images based on user inputs in a single modality, such as text, segmentation, sketch, or style reference. They are often unable to leverage multimodal user inputs when available, which reduces their practicality. To address this limitation, we propose the Product-of-Experts Generative Adversarial Networks (PoE-GAN) framework, which can synthesize images conditioned on multiple input modalities or any subset of them, even the empty set. PoE-GAN consists of a product-of-experts generator and a multimodal multiscale projection discriminator. Through our carefully designed training scheme, PoE-GAN learns to synthesize images with high quality and diversity. Besides advancing the state of the art in multimodal conditional image synthesis, PoE-GAN also outperforms the best existing unimodal conditional image synthesis approaches when tested in the unimodal setting. The project website is available at https://deepimagination.github.io/PoE-GAN.
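
To make the "any subset of modalities, even the empty set" idea concrete, here is a minimal sketch of a product of Gaussian experts. It is an illustration under stated assumptions, not the PoE-GAN implementation: the abstract does not say how the experts are parameterized, so the sketch assumes each modality contributes a diagonal Gaussian over a latent vector and that a standard normal prior acts as an always-present expert. The names `product_of_gaussian_experts` and `fuse_modalities` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# A diagonal Gaussian: (mean per dimension, variance per dimension).
Gaussian = Tuple[List[float], List[float]]


def product_of_gaussian_experts(experts: List[Gaussian], dim: int) -> Gaussian:
    """Multiply a standard normal prior with zero or more Gaussian experts.

    The product of Gaussians is again Gaussian: precisions (1 / variance)
    add up, and the mean is the precision-weighted average of the means.
    """
    # Assumption: the prior expert is N(0, 1), so it contributes
    # precision 1 and precision-weighted mean 0 in every dimension.
    precision = [1.0] * dim
    weighted_mean = [0.0] * dim
    for mean, var in experts:
        for i in range(dim):
            p = 1.0 / var[i]
            precision[i] += p
            weighted_mean[i] += p * mean[i]
    var_out = [1.0 / p for p in precision]
    mean_out = [wm * v for wm, v in zip(weighted_mean, var_out)]
    return mean_out, var_out


def fuse_modalities(inputs: Dict[str, Optional[Gaussian]], dim: int) -> Gaussian:
    """Fuse whichever modalities are present; missing ones are simply skipped."""
    present = [g for g in inputs.values() if g is not None]
    return product_of_gaussian_experts(present, dim)


# No modality given: the result falls back to the prior (unconditional case).
empty = fuse_modalities({"text": None, "sketch": None}, dim=1)

# Two modalities given: each one narrows the distribution further.
both = fuse_modalities(
    {"text": ([2.0], [1.0]), "sketch": ([-1.0], [0.5])}, dim=1
)
```

Under these assumptions, dropping a modality only removes one factor from the product, which is why the same function covers the unimodal, multimodal, and unconditional cases. Each added expert can only increase the total precision, so more inputs yield a narrower distribution.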

research
05/10/2023

MMoT: Mixture-of-Modality-Tokens Transformer for Composed Multimodal Conditional Image Synthesis

Existing multimodal conditional image synthesis (MCIS) methods generate ...
research
07/06/2022

Text to Image Synthesis using Stacked Conditional Variational Autoencoders and Conditional Generative Adversarial Networks

Synthesizing a realistic image from textual description is a major chall...
research
08/27/2023

Bi-Modality Medical Image Synthesis Using Semi-Supervised Sequential Generative Adversarial Networks

In this paper, we propose a bi-modality medical image synthesis approach...
research
11/25/2022

Unifying conditional and unconditional semantic image synthesis with OCO-GAN

Generative image models have been extensively studied in recent years. I...
research
04/07/2020

Multimodal Image Synthesis with Conditional Implicit Maximum Likelihood Estimation

Many tasks in computer vision and graphics fall within the framework of ...
research
09/07/2023

Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis

Due to the difficulty in scaling up, generative adversarial networks (GA...
research
12/27/2021

Multimodal Image Synthesis and Editing: A Survey

As information exists in various modalities in real world, effective int...