BAGM: A Backdoor Attack for Manipulating Text-to-Image Generative Models

by Jordan Vice, et al.

The rise in popularity of text-to-image generative artificial intelligence (AI) has attracted widespread public interest. At the same time, backdoor attacks are well known in the machine learning literature for their effective manipulation of neural models, which is a growing concern among practitioners. We highlight this threat for generative AI by introducing a Backdoor Attack on text-to-image Generative Models (BAGM). Our attack targets various stages of the text-to-image generative pipeline, modifying the behaviour of the embedded tokenizer and the pre-trained language and visual neural networks. Based on the penetration level, BAGM takes the form of a suite of attacks, referred to in this article as surface, shallow and deep attacks. We compare the performance of BAGM to recently emerging related methods. We also contribute a set of quantitative metrics for assessing the performance of future backdoor attacks on generative AI models. The efficacy of the proposed framework is established by targeting the state-of-the-art Stable Diffusion pipeline, with digital marketing as the target domain. To that end, we also contribute a Marketable Foods dataset of branded product images. We hope this work helps expose contemporary generative AI security challenges and fosters discussion of preemptive efforts to address them.

Keywords: Generative Artificial Intelligence, Generative Models, Text-to-Image Generation, Backdoor Attacks, Trojan, Stable Diffusion.



Rickrolling the Artist: Injecting Invisible Backdoors into Text-Guided Image Generation Models

While text-to-image synthesis currently enjoys great popularity among re...

AI Imagery and the Overton Window

AI-based text-to-image generation has undergone a significant leap in th...

What is in a Text-to-Image Prompt: The Potential of Stable Diffusion in Visual Arts Education

Text-to-Image artificial intelligence (AI) recently saw a major breakthr...

VillanDiffusion: A Unified Backdoor Attack Framework for Diffusion Models

Diffusion Models (DMs) are state-of-the-art generative models that learn...

Intriguing Properties of Diffusion Models: A Large-Scale Dataset for Evaluating Natural Attack Capability in Text-to-Image Generative Models

Denoising probabilistic diffusion models have shown breakthrough perform...

Biases in Generative Art—A Causal Look from the Lens of Art History

With rapid progress in artificial intelligence (AI), popularity of gener...

A Framework and Dataset for Abstract Art Generation via CalligraphyGAN

With the advancement of deep learning, artificial intelligence (AI) has ...
