Curve Your Attention: Mixed-Curvature Transformers for Graph Representation Learning

by Sungjun Cho, et al.

Real-world graphs naturally exhibit hierarchical or cyclical structures that are ill-suited to the typical Euclidean space. While there exist graph neural networks that leverage hyperbolic or spherical spaces to learn representations that embed such structures more accurately, these methods are confined to the message-passing paradigm, making the models vulnerable to side effects such as oversmoothing and oversquashing. More recent works have proposed global attention-based graph Transformers that can easily model long-range interactions, but their extensions to non-Euclidean geometry are as yet unexplored. To bridge this gap, we propose the Fully Product-Stereographic Transformer, a generalization of Transformers that operates entirely on the product of constant curvature spaces. When combined with tokenized graph Transformers, our model can learn the curvature appropriate for the input graph in an end-to-end fashion, without the need for additional tuning over different curvature initializations. We also provide a kernelized approach to non-Euclidean attention, which enables our model to run with time and memory cost linear in the number of nodes and edges while respecting the underlying geometry. Experiments on graph reconstruction and node classification demonstrate the benefits of generalizing Transformers to the non-Euclidean domain.
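To illustrate why a kernelized attention can run in linear cost, the following is a minimal sketch of the generic kernel trick in the ordinary Euclidean setting. It is not the paper's non-Euclidean attention: the feature map `elu(x) + 1` and the function names `feature_map`, `quadratic_attention`, and `linear_attention` are assumptions chosen for illustration. The point is only that, once the similarity factorizes as a dot product of feature maps, the key-value summary can be accumulated once and reused for every query.

```python
import math

def feature_map(x):
    # Assumed positive feature map elu(x) + 1 (hypothetical choice, not from the paper).
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]

def quadratic_attention(Q, K, V):
    # Reference version: compares every query with every key, O(n^2) in sequence length.
    d_v = len(V[0])
    out = []
    for q in Q:
        fq = feature_map(q)
        weights = [sum(a * b for a, b in zip(fq, feature_map(k))) for k in K]
        total = sum(weights)
        out.append([sum(w * v[j] for w, v in zip(weights, V)) / total
                    for j in range(d_v)])
    return out

def linear_attention(Q, K, V):
    # Kernelized version: accumulate S = sum_k phi(k) v^T and z = sum_k phi(k) once,
    # then answer each query in O(d_k * d_v), so the cost is linear in sequence length.
    d_k, d_v = len(K[0]), len(V[0])
    S = [[0.0] * d_v for _ in range(d_k)]
    z = [0.0] * d_k
    for k, v in zip(K, V):
        fk = feature_map(k)
        for i in range(d_k):
            z[i] += fk[i]
            for j in range(d_v):
                S[i][j] += fk[i] * v[j]
    out = []
    for q in Q:
        fq = feature_map(q)
        denom = sum(fq[i] * z[i] for i in range(d_k))
        out.append([sum(fq[i] * S[i][j] for i in range(d_k)) / denom
                    for j in range(d_v)])
    return out

# Toy inputs: 3 tokens, 2-dimensional queries/keys, 2-dimensional values.
Q = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8]]
K = [[0.1, 0.4], [-0.7, 1.2], [0.9, -0.5]]
V = [[1.0, 0.0], [0.0, 1.0], [2.0, -1.0]]
slow = quadratic_attention(Q, K, V)
fast = linear_attention(Q, K, V)
```

Both functions compute the same normalized weighted average of the values; only the order of summation differs, which is what removes the quadratic pairwise comparison.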



Constant Curvature Graph Convolutional Networks

Interest has been rising lately towards methods representing data in non...

ACE-HGNN: Adaptive Curvature Exploration Hyperbolic Graph Neural Network

Graph Neural Networks (GNNs) have been widely studied in various graph d...

e3nn: Euclidean Neural Networks

We present e3nn, a generalized framework for creating E(3) equivariant t...

AGFormer: Efficient Graph Representation with Anchor-Graph Transformer

To alleviate the local receptive issue of GCN, Transformers have been ex...

Switch Spaces: Learning Product Spaces with Sparse Gating

Learning embedding spaces of suitable geometry is critical for represent...

Mitigating Over-Smoothing and Over-Squashing using Augmentations of Forman-Ricci Curvature

While Graph Neural Networks (GNNs) have been successfully leveraged for ...

Multiresolution Graph Transformers and Wavelet Positional Encoding for Learning Hierarchical Structures

Contemporary graph learning algorithms are not well-defined for large mo...
