Optimizing Prompts for Text-to-Image Generation

12/19/2022
by Yaru Hao, et al.

Well-designed prompts can guide text-to-image models to generate high-quality images. However, prompts that perform well are often model-specific and misaligned with the way users naturally phrase their input. Instead of relying on laborious human engineering, we propose prompt adaptation, a general framework that automatically adapts original user input to model-preferred prompts. Specifically, we first perform supervised fine-tuning of a pretrained language model on a small collection of manually engineered prompts. We then use reinforcement learning to explore better prompts, with a reward function that encourages the policy to generate more aesthetically pleasing images while preserving the original user intentions. Experimental results on Stable Diffusion show that our method outperforms manual prompt engineering on both automatic metrics and human preference ratings. Moreover, reinforcement learning further boosts performance, especially on out-of-domain prompts. The pretrained checkpoints are available at https://aka.ms/promptist. The demo can be found at https://aka.ms/promptist-demo.
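
The reward described in the abstract balances two goals: more aesthetically pleasing images and fidelity to the original user intent. The abstract does not give the exact formula, so the following is only a minimal sketch under stated assumptions. The names `prompt_reward`, `RewardWeights`, `toy_aesthetic`, and `toy_relevance` are hypothetical, the additive weighting is an assumption, and the toy scorers operate on prompt text only so that the example runs without an image model. In a real setup, both scorers would presumably be computed on images generated from the prompts.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights for the two reward terms (an assumption, not from the paper)."""
    aesthetic: float = 1.0
    relevance: float = 1.0


def prompt_reward(
    user_input: str,
    adapted_prompt: str,
    aesthetic_fn: Callable[[str], float],
    relevance_fn: Callable[[str, str], float],
    weights: RewardWeights = RewardWeights(),
) -> float:
    """Combine an aesthetic-gain term with an intent-preservation term."""
    # How much "prettier" the adapted prompt scores than the raw user input.
    gain = aesthetic_fn(adapted_prompt) - aesthetic_fn(user_input)
    # How well the adapted prompt still reflects what the user asked for.
    relevance = relevance_fn(user_input, adapted_prompt)
    return weights.aesthetic * gain + weights.relevance * relevance


# Toy stand-ins for learned scorers (assumptions for illustration only).
STYLE_MODIFIERS = ("highly detailed", "sharp focus", "digital painting")


def toy_aesthetic(prompt: str) -> float:
    # Count how many known style modifiers appear in the prompt.
    text = prompt.lower()
    return float(sum(modifier in text for modifier in STYLE_MODIFIERS))


def toy_relevance(user_input: str, adapted_prompt: str) -> float:
    # Fraction of distinct user words that survive in the adapted prompt.
    user_words = set(user_input.lower().split())
    adapted_words = set(adapted_prompt.lower().replace(",", " ").split())
    if not user_words:
        return 0.0
    return len(user_words & adapted_words) / len(user_words)


user = "a cat on a sofa"
on_topic = prompt_reward(
    user, "a cat on a sofa, highly detailed, sharp focus",
    toy_aesthetic, toy_relevance,
)
off_topic = prompt_reward(
    user, "a dog, highly detailed, sharp focus",
    toy_aesthetic, toy_relevance,
)
print(on_topic, off_topic)
```

In this sketch, both candidate prompts receive the same aesthetic gain, but the one that drops the user's subject is penalized by the relevance term, which is the trade-off the abstract describes.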

