Old can be Gold: Better Gradient Flow can Make Vanilla-GCNs Great Again

by   Ajay Jaiswal, et al.

Despite the enormous success of Graph Convolutional Networks (GCNs) in modeling graph-structured data, most of the current GCNs are shallow due to the notoriously challenging problems of over-smoothening and information squashing along with conventional difficulty caused by vanishing gradients and over-fitting. Previous works have been primarily focused on the study of over-smoothening and over-squashing phenomena in training deep GCNs. Surprisingly, in comparison with CNNs/RNNs, very limited attention has been given to understanding how healthy gradient flow can benefit the trainability of deep GCNs. In this paper, firstly, we provide a new perspective of gradient flow to understand the substandard performance of deep GCNs and hypothesize that by facilitating healthy gradient flow, we can significantly improve their trainability, as well as achieve state-of-the-art (SOTA) level performance from vanilla-GCNs. Next, we argue that blindly adopting the Glorot initialization for GCNs is not optimal, and derive a topology-aware isometric initialization scheme for vanilla-GCNs based on the principles of isometry. Additionally, contrary to ad-hoc addition of skip-connections, we propose to use gradient-guided dynamic rewiring of vanilla-GCNs with skip connections. Our dynamic rewiring method uses the gradient flow within each layer during training to introduce on-demand skip-connections adaptively. We provide extensive empirical evidence across multiple datasets that our methods improve gradient flow in deep vanilla-GCNs and significantly boost their performance to comfortably compete and outperform many fancy state-of-the-art methods. Codes are available at: https://github.com/VITA-Group/GradientGCN.


Deep Isometric Learning for Visual Recognition

Initialization, normalization, and skip connections are believed to be t...

Improving the Trainability of Deep Neural Networks through Layerwise Batch-Entropy Regularization

Training deep neural networks is a very demanding task, especially chall...

PipeGCN: Efficient Full-Graph Training of Graph Convolutional Networks with Pipelined Feature Communication

Graph Convolutional Networks (GCNs) is the state-of-the-art method for l...

DeepGCNs: Making GCNs Go as Deep as CNNs

Convolutional Neural Networks (CNNs) have been very successful at solvin...

Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks

In recent years, state-of-the-art methods in computer vision have utiliz...

New Insights into Graph Convolutional Networks using Neural Tangent Kernels

Graph Convolutional Networks (GCNs) have emerged as powerful tools for l...

Skip Connections Eliminate Singularities

Skip connections made the training of very deep networks possible and ha...

Please sign up or login with your details

Forgot password? Click here to reset