Gophormer: Ego-Graph Transformer for Node Classification

by Jianan Zhao, et al.

Transformers have achieved remarkable performance in many fields, including natural language processing and computer vision. However, in graph mining, where graph neural networks (GNNs) have been the dominant paradigm, transformers have not achieved competitive performance, especially on the node classification task. Existing graph transformer models typically apply a fully connected attention mechanism to the whole input graph; as a result, they suffer from severe scalability issues and are intractable to train in data-insufficient cases. To alleviate these issues, we propose Gophormer, a novel model that applies transformers to ego-graphs instead of full graphs. Specifically, a Node2Seq module is proposed to sample ego-graphs as the input of transformers, which alleviates the scalability challenge and serves as an effective data augmentation technique to boost model performance. Moreover, unlike the feature-based attention strategy in vanilla transformers, we propose a proximity-enhanced attention mechanism to capture fine-grained structural bias. To handle the uncertainty introduced by ego-graph sampling, we further propose a consistency regularization and a multi-sample inference strategy for stabilized training and testing, respectively. Extensive experiments on six benchmark datasets demonstrate the superiority of Gophormer over existing graph transformers and popular GNNs, revealing the promising future of graph transformers.
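The abstract does not spell out how Node2Seq samples ego-graphs or how multi-sample inference combines predictions, so the following is only a minimal sketch under stated assumptions: the graph is a plain adjacency list, ego-graphs are drawn by fan-out-limited neighbor sampling up to a fixed depth, and test-time predictions are averaged over several sampled ego-graphs. The names `sample_ego_graph`, `multi_sample_predict`, `depth`, `fanout`, and the `predict` callable are hypothetical and are not taken from the paper.

```python
import random
from typing import Callable, Dict, List, Sequence

Graph = Dict[int, List[int]]  # adjacency list: node id -> neighbor ids


def sample_ego_graph(graph: Graph, center: int, depth: int, fanout: int,
                     rng: random.Random) -> List[int]:
    """Return one sampled ego-graph around `center` as a node sequence.

    Assumption: at most `fanout` unseen neighbors are drawn per node,
    for `depth` hops. The center node always comes first.
    """
    sequence = [center]
    seen = {center}
    frontier = [center]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            # Deduplicate neighbors and skip nodes already in the sequence.
            candidates = [n for n in dict.fromkeys(graph.get(node, []))
                          if n not in seen]
            picked = rng.sample(candidates, min(fanout, len(candidates)))
            for n in picked:
                seen.add(n)
                sequence.append(n)
                next_frontier.append(n)
        frontier = next_frontier
    return sequence


def multi_sample_predict(graph: Graph, center: int,
                         predict: Callable[[List[int]], Sequence[float]],
                         num_samples: int, depth: int, fanout: int,
                         seed: int = 0) -> List[float]:
    """Average class probabilities over `num_samples` sampled ego-graphs.

    `predict` stands in for a trained model (hypothetical here) that maps
    a node sequence to class probabilities. `num_samples` must be >= 1.
    """
    rng = random.Random(seed)
    total: List[float] = []
    for _ in range(num_samples):
        probs = predict(sample_ego_graph(graph, center, depth, fanout, rng))
        total = list(probs) if not total else [t + p for t, p in zip(total, probs)]
    return [t / num_samples for t in total]
```

Because each call draws a different ego-graph for the same center node, the same sampler can serve as data augmentation during training, while averaging over several draws at test time reduces the variance that sampling introduces.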




Hierarchical Graph Transformer with Adaptive Node Sampling

The Transformer architecture has achieved remarkable success in a number...

Relphormer: Relational Graph Transformer for Knowledge Graph Representation

Transformers have achieved remarkable performance in widespread fields, ...

Deformable Graph Transformer

Transformer-based models have been widely used and achieved state-of-the...

Do Transformers Really Perform Bad for Graph Representation?

The Transformer architecture has become a dominant choice in many domain...

Tokenized Graph Transformer with Neighborhood Augmentation for Node Classification in Large Graphs

Graph Transformers, emerging as a new architecture for graph representat...

iMixer: hierarchical Hopfield network implies an invertible, implicit and iterative MLP-Mixer

In the last few years, the success of Transformers in computer vision ha...

A Generalization of Transformer Networks to Graphs

We propose a generalization of transformer neural network architecture f...
