Reverse Stable Diffusion: What prompt was used to generate this image?

08/02/2023
by   Florinel-Alin Croitoru, et al.

Text-to-image diffusion models such as Stable Diffusion have recently attracted the interest of many researchers, and inverting the diffusion process can play an important role in better understanding the generative process and in engineering prompts that obtain the desired images. To this end, we introduce the new task of predicting the text prompt given an image generated by a generative diffusion model. We combine a series of white-box and black-box models (with and without access to the weights of the diffusion network) to deal with the proposed task. We propose a novel learning framework comprising a joint prompt regression and multi-label vocabulary classification objective that generates improved prompts. To further improve our method, we employ a curriculum learning procedure that promotes the learning of image-prompt pairs with lower labeling noise (i.e., pairs whose prompts and images are better aligned), and an unsupervised domain-adaptive kernel learning method that uses the similarities between samples in the source and target domains as extra features. We conduct experiments on the DiffusionDB data set, predicting text prompts from images generated by Stable Diffusion. Our novel learning framework produces excellent results on the aforementioned task, yielding the highest gains when applied to the white-box model. In addition, we make an interesting discovery: a diffusion model trained on the prompt generation task, when reused directly for text-to-image generation, produces images that are much better aligned with the input prompts.
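The two core ideas of the framework, the joint objective and the curriculum ordering, can be illustrated with a minimal toy sketch. This is not the authors' code: the embeddings, logits, and alignment scores below are hypothetical placeholders, the regression term is assumed to be a cosine distance between prompt embeddings, and the classification term a binary cross-entropy over vocabulary words.

```python
import math

def curriculum_order(pairs):
    """Order (image_id, prompt, alignment_score) triples from best- to
    worst-aligned, so that low-labeling-noise pairs are learned first
    (an easy-to-hard curriculum). Scores could come from, e.g., an
    image-text similarity model; here they are assumed given."""
    return sorted(pairs, key=lambda p: p[2], reverse=True)

def joint_loss(pred_emb, true_emb, vocab_logits, vocab_labels, alpha=0.5):
    """Toy joint objective: prompt regression (cosine distance between the
    predicted and target prompt embeddings) plus multi-label vocabulary
    classification (mean binary cross-entropy over vocabulary words)."""
    # Regression term: 1 - cosine similarity of the two embeddings.
    dot = sum(p * t for p, t in zip(pred_emb, true_emb))
    norm = math.sqrt(sum(p * p for p in pred_emb)) * \
           math.sqrt(sum(t * t for t in true_emb))
    regression = 1.0 - dot / norm
    # Classification term: BCE over per-word logits and 0/1 labels.
    eps = 1e-9
    bce = 0.0
    for logit, label in zip(vocab_logits, vocab_labels):
        prob = 1.0 / (1.0 + math.exp(-logit))
        bce -= label * math.log(prob + eps) + (1 - label) * math.log(1 - prob + eps)
    classification = bce / len(vocab_logits)
    return alpha * regression + (1 - alpha) * classification

# Hypothetical usage: schedule pairs by alignment, then score a prediction.
pairs = [("img_a", "a cat", 0.21), ("img_b", "a dog", 0.87), ("img_c", "a bird", 0.55)]
schedule = curriculum_order(pairs)          # img_b is seen first
loss = joint_loss([1.0, 0.0], [1.0, 0.0], [10.0, 10.0], [1, 1])  # near-perfect prediction
```

The weighting `alpha` between the two terms is an assumption for illustration; the paper's actual loss formulation and vocabulary construction are described in the full text.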


