Wavelet Diffusion Models are fast and scalable Image Generators

11/29/2022
by Hao Phung, et al.

Diffusion models are emerging as a powerful solution for high-fidelity image generation, exceeding GANs in quality in many circumstances. However, their slow training and inference are a major bottleneck that blocks their use in real-time applications. The recent DiffusionGAN method significantly reduces running time by cutting the number of sampling steps from thousands to a few, but its speed still lags far behind that of its GAN counterparts. This paper aims to reduce the speed gap by proposing a novel wavelet-based diffusion structure. We extract low- and high-frequency components at both the image and feature levels via wavelet decomposition and handle these components adaptively for faster processing while maintaining good generation quality. Furthermore, we propose a reconstruction term that effectively boosts training convergence. Experimental results on the CelebA-HQ, CIFAR-10, LSUN-Church, and STL-10 datasets show that our solution is a stepping stone toward real-time, high-fidelity diffusion models. Our code and pre-trained checkpoints will be available at <https://github.com/VinAIResearch/WaveDiff.git>.
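
To make the wavelet decomposition mentioned in the abstract concrete, here is a minimal sketch of a single-level 2D transform that splits an image into one low-frequency and three high-frequency subbands, each at half the spatial resolution. The choice of the Haar wavelet, the NumPy implementation, and the function names `haar_dwt2` and `haar_idwt2` are assumptions made for illustration; the abstract does not specify them, and this is not the authors' code.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar transform (illustrative; assumes even height and width).

    Returns four subbands, each of shape (H/2, W/2).
    """
    x = np.asarray(x, dtype=np.float64)
    # The four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # low-frequency: scaled block average
    lh = (a + b - c - d) / 2.0  # top row minus bottom row
    hl = (a - b + c - d) / 2.0  # left column minus right column
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the full-resolution array."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

# Usage: decompose a random 8x8 "image" and reconstruct it.
image = np.random.default_rng(0).random((8, 8))
ll, lh, hl, hh = haar_dwt2(image)
restored = haar_idwt2(ll, lh, hl, hh)
```

Because each subband has a quarter of the original pixel count, a model that operates on the stacked subbands works at half the spatial resolution, which is one plausible source of the speedup the abstract describes. The transform is invertible, so no image information is lost in this step.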

research
01/19/2023

Fast Inference in Denoising Diffusion Models via MMD Finetuning

Denoising Diffusion Models (DDMs) have become a popular tool for generat...
research
11/27/2022

Diffusion Probabilistic Model Made Slim

Despite the recent visually-pleasing results achieved, the massive compu...
research
03/30/2023

Token Merging for Fast Stable Diffusion

The landscape of image generation has been forever changed by open vocab...
research
08/30/2023

Stage-by-stage Wavelet Optimization Refinement Diffusion Model for Sparse-View CT Reconstruction

Diffusion models have emerged as potential tools to tackle the challenge...
research
10/12/2021

SDWNet: A Straight Dilated Network with Wavelet Transformation for Image Deblurring

Image deblurring is a classical computer vision problem that aims to rec...
research
09/13/2023

DCTTS: Discrete Diffusion Model with Contrastive Learning for Text-to-speech Generation

In the Text-to-speech(TTS) task, the latent diffusion model has excellen...
research
06/26/2023

Restart Sampling for Improving Generative Processes

Generative processes that involve solving differential equations, such a...
