Schrödinger's Bat: Diffusion Models Sometimes Generate Polysemous Words in Superposition

by Jennifer C. White, et al.

Recent work has shown that despite their impressive capabilities, text-to-image diffusion models such as DALL-E 2 (Ramesh et al., 2022) can display strange behaviours when a prompt contains a word with multiple possible meanings, often generating images containing both senses of the word (Rassin et al., 2022). In this work we seek to put forward a possible explanation of this phenomenon. Using the similar Stable Diffusion model (Rombach et al., 2022), we first show that when given an input that is the sum of encodings of two distinct words, the model can produce an image containing both concepts represented in the sum. We then demonstrate that the CLIP encoder used to encode prompts (Radford et al., 2021) encodes polysemous words as a superposition of meanings, and that using linear algebraic techniques we can edit these representations to influence the senses represented in the generated images. Combining these two findings, we suggest that the homonym duplication phenomenon described by Rassin et al. (2022) is caused by diffusion models producing images representing both of the meanings that are present in superposition in the encoding of a polysemous word.
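The two ideas in the abstract, an encoding that is the sum of two meanings and a linear-algebraic edit that suppresses one of them, can be illustrated with a minimal sketch. It assumes toy three-dimensional vectors in place of real CLIP encodings and uses plain orthogonal projection as the edit. The abstract does not specify the authors' actual technique, so the vectors, the names `animal_sense`, `sports_sense` and `project_out`, and the choice of projection are all hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def cosine(a, b):
    return dot(a, b) / (norm(a) * norm(b))

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def project_out(v, direction):
    """Remove from v its component along direction (orthogonal projection)."""
    scale = dot(v, direction) / dot(direction, direction)
    return [x - scale * d for x, d in zip(v, direction)]

# Hypothetical toy "sense" vectors (assumption: not real CLIP encodings,
# which are high-dimensional and produced by the text encoder).
animal_sense = [1.0, 0.0, 0.2]
sports_sense = [0.0, 1.0, 0.2]

# A polysemous word modelled as a superposition (sum) of its two senses.
bat = add(animal_sense, sports_sense)

# Edit the representation by projecting out the sports-sense direction.
bat_edited = project_out(bat, sports_sense)

print("bat vs animal:       ", cosine(bat, animal_sense))
print("bat vs sports:       ", cosine(bat, sports_sense))
print("edited bat vs animal:", cosine(bat_edited, animal_sense))
print("edited bat vs sports:", cosine(bat_edited, sports_sense))
```

In this toy setting the summed vector is equally similar to both senses, while the edited vector is orthogonal to the removed sense and stays close to the remaining one. Whether real encodings decompose this cleanly is the empirical question the paper investigates.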




