TextPainter: Multimodal Text Image Generation with Visual-harmony and Text-comprehension for Poster Design

08/09/2023
by   Yifan Gao, et al.

Text design is one of the most critical procedures in poster design, as it relies heavily on human creativity and expertise to design text images that account for both visual harmony and text semantics. This study introduces TextPainter, a novel multimodal approach that leverages contextual visual information and the corresponding text semantics to generate text images. Specifically, TextPainter takes the global-local background image as a style hint, guiding text image generation toward visual harmony. Furthermore, we leverage a language model and introduce a text comprehension module to achieve both sentence-level and word-level style variations. In addition, we construct the PosterT80K dataset, consisting of about 80K posters annotated with sentence-level bounding boxes and text contents. We hope this dataset will pave the way for further research on multimodal text image generation. Extensive quantitative and qualitative experiments demonstrate that TextPainter can generate visually and semantically harmonious text images for posters.

