Towards Deep Attention in Graph Neural Networks: Problems and Remedies

by Soo Yong Lee, et al.

Graph neural networks (GNNs) learn the representation of graph-structured data, and their expressiveness can be further enhanced by inferring node relations for propagation. Attention-based GNNs infer the importance of each neighbor to weight its contribution during propagation. Despite their popularity, the discussion on deep graph attention and its unique challenges has been limited. In this work, we investigate some problematic phenomena related to deep graph attention, including vulnerability to over-smoothed features and smooth cumulative attention. Through theoretical and empirical analyses, we show that various attention-based GNNs suffer from these problems. Motivated by our findings, we propose AERO-GNN, a novel GNN architecture designed for deep graph attention. AERO-GNN provably mitigates the proposed problems of deep graph attention, which is further empirically demonstrated with (a) its adaptive and less smooth attention functions and (b) higher performance at deep layers (up to 64). On 9 out of 12 node classification benchmarks, AERO-GNN outperforms the baseline GNNs, highlighting the advantages of deep graph attention. Our code is available at


On Recoverability of Graph Neural Network Representations

Despite their growing popularity, graph neural networks (GNNs) still hav...

Fisher Information Embedding for Node and Graph Learning

Attention-based graph neural networks (GNNs), such as graph attention ne...

Rethinking Graph Regularization For Graph Neural Networks

The graph Laplacian regularization term is usually used in semi-supervis...

Fast Graph Attention Networks Using Effective Resistance Based Graph Sparsification

The attention mechanism has demonstrated superior performance for infere...

A Bird's-Eye Tutorial of Graph Attention Architectures

Graph Neural Networks (GNNs) have shown tremendous strides in performanc...

Between-Sample Relationship in Learning Tabular Data Using Graph and Attention Networks

Traditional machine learning assumes samples in tabular data to be indep...

Wind Park Power Prediction: Attention-Based Graph Networks and Deep Learning to Capture Wake Losses

With the increased penetration of wind energy into the power grid, it ha...
