GenAssist: Making Image Generation Accessible

07/14/2023
by Mina Huh, et al.

Blind and low vision (BLV) creators use images to communicate with sighted audiences. However, creating or retrieving images is challenging for BLV creators because it is difficult to use authoring tools or assess image search results. As a result, creators limit the types of images they create or recruit sighted collaborators. While text-to-image generation models let creators generate high-fidelity images from a text description (i.e., a prompt), it remains difficult for BLV creators to assess the content and quality of the generated images. We present GenAssist, a system to make text-to-image generation accessible. Using our interface, creators can verify whether generated image candidates follow the prompt, access additional details in the image not specified in the prompt, and skim a summary of similarities and differences between image candidates. To power the interface, GenAssist uses a large language model to generate visual questions, vision-language models to extract answers, and a large language model to summarize the results. Our study with 12 BLV creators demonstrated that GenAssist enables and simplifies the process of image selection and generation, making visual authoring more accessible to all.
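
The three stages named in the abstract (a language model proposes visual questions, vision-language models answer them for each image candidate, and a language model summarizes the answers) can be illustrated with a minimal Python sketch. This is not the authors' implementation: `describe_candidates`, the stub functions, and the canned answers are hypothetical stand-ins that only show how the stages might be wired together, with real models swapped in for the stubs.

```python
from typing import Callable, Dict, List, Tuple

# image id -> question -> answer
AnswerTable = Dict[str, Dict[str, str]]


def describe_candidates(
    prompt: str,
    image_ids: List[str],
    generate_questions: Callable[[str], List[str]],
    answer: Callable[[str, str], str],
    summarize: Callable[[AnswerTable], str],
) -> Tuple[AnswerTable, str]:
    """Run the three stages: questions, per-image answers, summary."""
    questions = generate_questions(prompt)
    table = {img: {q: answer(img, q) for q in questions} for img in image_ids}
    return table, summarize(table)


# --- Hypothetical stubs standing in for real models (assumptions) ---

def stub_questions(prompt: str) -> List[str]:
    # Stands in for a language model that proposes visual questions.
    return [f"Does the image show {prompt}?", "What is in the background?"]


# Canned answers standing in for vision-language model output.
CANNED = {
    ("img1", "What is in the background?"): "a beach",
    ("img2", "What is in the background?"): "a forest",
}


def stub_answer(image_id: str, question: str) -> str:
    # Stands in for a vision-language model queried per image.
    return CANNED.get((image_id, question), "yes")


def stub_summary(table: AnswerTable) -> str:
    # Stands in for a language model: label each question by whether
    # all candidates gave the same answer.
    if not table:
        return ""
    questions = list(next(iter(table.values())))
    lines = []
    for q in questions:
        answers = {row[q] for row in table.values()}
        label = "same" if len(answers) == 1 else "differs"
        lines.append(f"{label}: {q}")
    return "\n".join(lines)


table, summary = describe_candidates(
    "a dog on a skateboard",
    ["img1", "img2"],
    stub_questions,
    stub_answer,
    stub_summary,
)
print(summary)
```

Passing the three stages in as callables keeps the sketch independent of any particular model or API; the per-image answer table is what a screen-reader-friendly interface could expose for prompt verification and comparison.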

research
06/27/2021

Visual Conceptual Blending with Large-scale Language and Vision Models

We ask the question: to what extent can recent large-scale language and ...
research
09/05/2023

Breaking Barriers to Creative Expression: Co-Designing and Implementing an Accessible Text-to-Image Interface

Text-to-image generation models have grown in popularity due to their ab...
research
12/06/2022

M-VADER: A Model for Diffusion with Multimodal Context

We introduce M-VADER: a diffusion model (DM) for image generation where ...
research
07/18/2023

Let's ViCE! Mimicking Human Cognitive Behavior in Image Generation Evaluation

Research in Image Generation has recently made significant progress, par...
research
07/11/2023

TIAM – A Metric for Evaluating Alignment in Text-to-Image Generation

The progress in the generation of synthetic images has made it crucial t...
research
11/07/2022

Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale

Machine learning models are now able to convert user-written text descri...
research
06/04/2023

Using artificial-intelligence tools to make LaTeX content accessible to blind readers

Screen-reader software enables blind users to access large segments of e...
