Exploring Transformer Backbones for Image Diffusion Models

12/27/2022
by Princy Chahal, et al.

We present an end-to-end Transformer-based Latent Diffusion model for image synthesis. On the ImageNet class-conditioned generation task, the Transformer-based Latent Diffusion model achieves an FID of 14.1, comparable to the 13.1 FID of a UNet-based architecture. Beyond demonstrating Transformers for diffusion-based image synthesis, this architectural simplification enables straightforward fusion and joint modeling of text and image data: the multi-head self-attention mechanism of Transformers lets image and text features interact directly, removing the need for the cross-attention mechanism used in UNet-based diffusion models.
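The fusion mechanism described above can be illustrated with a minimal NumPy sketch: image latent tokens and text tokens are concatenated into one sequence and passed through ordinary self-attention, so image tokens attend to text tokens in the same operation, with no separate cross-attention module. All names, shapes, and weight initializations here are illustrative assumptions, not details from the paper.

```python
import numpy as np

def self_attention(tokens, W_q, W_k, W_v):
    """Single-head self-attention: every token attends to every token."""
    Q, K, V = tokens @ W_q, tokens @ W_k, tokens @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
d = 16                                     # illustrative embedding width
image_tokens = rng.normal(size=(64, d))    # e.g. an 8x8 grid of latent patches
text_tokens = rng.normal(size=(8, d))      # e.g. 8 text-conditioning embeddings

# Joint sequence: text and image share one attention operation
tokens = np.concatenate([image_tokens, text_tokens], axis=0)
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(tokens, W_q, W_k, W_v)

# The first 64 output rows are image tokens that have already mixed in
# text information, without any dedicated cross-attention layer.
print(out.shape)  # (72, 16)
```

A UNet-based diffusion model would instead insert cross-attention blocks where image features query text features; here that interaction falls out of plain self-attention over the concatenated sequence.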

Related research:

- Scalable Diffusion Models with Transformers (12/19/2022): We explore a new class of diffusion models based on the transformer arch...
- Masked Diffusion Transformer is a Strong Image Synthesizer (03/25/2023): Despite its success in image synthesis, we observe that diffusion probab...
- Array Camera Image Fusion using Physics-Aware Transformers (07/05/2022): We demonstrate a physics-aware transformer for feature-based data fusion...
- U-DiT TTS: U-Diffusion Vision Transformer for Text-to-Speech (05/22/2023): Deep learning has led to considerable advances in text-to-speech synthes...
- DiVAE: Photorealistic Images Synthesis with Denoising Diffusion Decoder (06/01/2022): Recently most successful image synthesis models are multi stage process ...
- Retrieval-Augmented Diffusion Models (04/25/2022): Generative image synthesis with diffusion models has recently achieved e...
