Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models

05/29/2023
by   Yuchao Gu, et al.
10

Public large-scale text-to-image diffusion models, such as Stable Diffusion, have gained significant attention from the community. These models can be easily customized for new concepts using low-rank adaptations (LoRAs). However, the utilization of multiple concept LoRAs to jointly support multiple customized concepts presents a challenge. We refer to this scenario as decentralized multi-concept customization, which involves single-client concept tuning and center-node concept fusion. In this paper, we propose a new framework called Mix-of-Show that addresses the challenges of decentralized multi-concept customization, including concept conflicts resulting from existing single-client LoRA tuning and identity loss during model fusion. Mix-of-Show adopts an embedding-decomposed LoRA (ED-LoRA) for single-client tuning and gradient fusion for the center node to preserve the in-domain essence of single concepts and support theoretically limitless concept fusion. Additionally, we introduce regionally controllable sampling, which extends spatially controllable sampling (e.g., ControlNet and T2I-Adaptor) to address attribute binding and missing object problems in multi-concept sampling. Extensive experiments demonstrate that Mix-of-Show is capable of composing multiple customized concepts with high fidelity, including characters, objects, and scenes.

READ FULL TEXT

page 2

page 5

page 7

page 8

page 9

page 17

page 18

page 19

research
04/12/2023

Continual Diffusion: Continual Customization of Text-to-Image Diffusion with C-LoRA

Recent works demonstrate a remarkable ability to customize text-to-image...
research
12/08/2022

Multi-Concept Customization of Text-to-Image Diffusion

While generative models produce high-quality images of concepts learned ...
research
03/23/2023

Ablating Concepts in Text-to-Image Diffusion Models

Large-scale text-to-image diffusion models can generate high-fidelity im...
research
02/23/2023

Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models

Text-to-image personalization aims to teach a pre-trained diffusion mode...
research
03/09/2023

Cones: Concept Neurons in Diffusion Models for Customized Generation

Human brains respond to semantic features of presented stimuli with diff...
research
01/19/2023

Dif-Fusion: Towards High Color Fidelity in Infrared and Visible Image Fusion with Diffusion Models

Color plays an important role in human visual perception, reflecting the...
research
07/20/2023

Divide Bind Your Attention for Improved Generative Semantic Nursing

Emerging large-scale text-to-image generative models, e.g., Stable Diffu...

Please sign up or login with your details

Forgot password? Click here to reset