Visual Chain-of-Thought Diffusion Models

03/28/2023
by   William Harvey, et al.
0

Recent progress with conditional image diffusion models has been stunning, and this holds true whether we are speaking about models conditioned on a text description, a scene layout, or a sketch. Unconditional image diffusion models are also improving but lag behind, as do diffusion models which are conditioned on lower-dimensional features like class labels. We propose to close the gap between conditional and unconditional models using a two-stage sampling procedure. In the first stage we sample an embedding describing the semantic content of the image. In the second stage we sample the image conditioned on this embedding and then discard the embedding. Doing so lets us leverage the power of conditional diffusion models on the unconditional generation task, which we show improves FID by 25-50 generation.

READ FULL TEXT

page 1

page 2

research
04/13/2022

Hierarchical Text-Conditional Image Generation with CLIP Latents

Contrastive models like CLIP have been shown to learn robust representat...
research
09/08/2023

MaskDiffusion: Boosting Text-to-Image Consistency with Conditional Mask

Recent advancements in diffusion models have showcased their impressive ...
research
11/08/2022

Self-conditioned Embedding Diffusion for Text Generation

Can continuous diffusion models bring the same performance breakthrough ...
research
08/05/2023

Sketch and Text Guided Diffusion Model for Colored Point Cloud Generation

Diffusion probabilistic models have achieved remarkable success in text ...
research
01/31/2023

Zero3D: Semantic-Driven Multi-Category 3D Shape Generation

Semantic-driven 3D shape generation aims to generate 3D objects conditio...
research
03/21/2023

Compositional 3D Scene Generation using Locally Conditioned Diffusion

Designing complex 3D scenes has been a tedious, manual process requiring...
research
03/04/2023

Diffusion Models Generate Images Like Painters: an Analytical Theory of Outline First, Details Later

How do diffusion generative models convert pure noise into meaningful im...

Please sign up or login with your details

Forgot password? Click here to reset