DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion

01/23/2023
by Qitian Wu, et al.
Real-world data generation often involves complex inter-dependencies among instances, which violates the i.i.d. assumption of standard learning paradigms and makes it difficult to uncover the geometric structures needed to learn good instance representations. To this end, we introduce an energy-constrained diffusion model that encodes a batch of instances from a dataset into evolutionary states, which progressively incorporate information from other instances through their interactions. The diffusion process is constrained by descent criteria w.r.t. a principled energy function that characterizes the global consistency of instance representations over latent structures. We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength between arbitrary instance pairs. This gives rise to a new class of neural encoders, dubbed DIFFormer (diffusion-based Transformers), with two instantiations: a simple version with linear complexity for prohibitively large numbers of instances, and an advanced version for learning complex structures. Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance in various tasks, such as node classification on large graphs, semi-supervised image/text classification, and spatial-temporal dynamics prediction.
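The abstract does not state the update rule, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes one explicit diffusion step of the form z_i <- (1 - tau) * z_i + tau * sum_j S_ij * z_j, with row-normalized pairwise weights S_ij proportional to 1 + cos(z_i, z_j). The step size `tau`, the similarity choice, and the function names are all illustrative assumptions. The point it shows is how all-pair propagation can be computed in time linear in the number of instances, by summing over instances once and reusing the result for every row.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def diffusion_step_naive(Z, tau=0.5):
    """Reference version: builds every pairwise weight explicitly, O(N^2 d)."""
    n, d = len(Z), len(Z[0])
    Zn = [normalize(z) for z in Z]
    out = []
    for i in range(n):
        # Assumed similarity: 1 + cosine, which is always non-negative.
        w = [1.0 + sum(a * b for a, b in zip(Zn[i], Zn[j])) for j in range(n)]
        total = sum(w)
        agg = [sum(w[j] * Z[j][c] for j in range(n)) / total for c in range(d)]
        out.append([(1 - tau) * Z[i][c] + tau * agg[c] for c in range(d)])
    return out


def diffusion_step_linear(Z, tau=0.5):
    """Same update without forming the N x N weight matrix, O(N d^2)."""
    n, d = len(Z), len(Z[0])
    Zn = [normalize(z) for z in Z]
    # Global summaries, computed once and shared by every instance.
    sum_z = [sum(Z[j][c] for j in range(n)) for c in range(d)]
    sum_zn = [sum(Zn[j][c] for j in range(n)) for c in range(d)]
    # kv[a][b] = sum_j Zn[j][a] * Z[j][b]  (a d x d matrix)
    kv = [[sum(Zn[j][a] * Z[j][b] for j in range(n)) for b in range(d)]
          for a in range(d)]
    out = []
    for i in range(n):
        q = Zn[i]
        # sum_j (1 + q . Zn[j]) = n + q . sum_zn
        denom = n + sum(q[a] * sum_zn[a] for a in range(d))
        # sum_j (1 + q . Zn[j]) * Z[j] = sum_z + q^T kv
        agg = [(sum_z[b] + sum(q[a] * kv[a][b] for a in range(d))) / denom
               for b in range(d)]
        out.append([(1 - tau) * Z[i][b] + tau * agg[b] for b in range(d)])
    return out
```

Calling `diffusion_step_linear(Z, tau)` on a list of equal-length feature vectors should agree with `diffusion_step_naive(Z, tau)` up to floating-point error. The linear version works here only because the assumed similarity is a sum of a constant and a dot product, which lets the sum over instances be factored out of the per-instance computation.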
