Making Graph Neural Networks Worth It for Low-Data Molecular Machine Learning

11/24/2020
by Aneesh Pappu, et al.

Graph neural networks have become very popular for machine learning on molecules due to the expressive power of their learnt representations. However, molecular machine learning is a classically low-data regime, and it is not clear that graph neural networks can avoid overfitting in low-resource settings. In contrast, fingerprint methods are the traditional standard for low-data environments due to their reduced number of parameters and manually engineered features. In this work, we investigate whether graph neural networks are competitive in small-data settings compared to the parametrically 'cheaper' alternative of fingerprint methods. Finding that they are not, we explore pretraining and the meta-learning method MAML (and its variants FO-MAML and ANIL) for improving graph neural network performance by transfer learning from related tasks. We find that MAML and FO-MAML do enable the graph neural network to outperform models based on fingerprints, providing a path to using graph neural networks even in settings with severely restricted data availability. In contrast to previous work, we find that ANIL performs worse than other meta-learning approaches in this molecular setting. Our results suggest two reasons: molecular machine learning tasks may require significant task-specific adaptation, and distribution shifts in test tasks relative to train tasks may contribute to worse ANIL performance.
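To make the difference between the meta-learning variants concrete, here is a minimal pure-Python sketch. It is not the paper's implementation: the two-parameter model `y_hat = head * body * x`, the function names, and all hyperparameters are assumptions chosen for illustration, with `body` standing in for the graph neural network's feature extractor and `head` for its task-specific output layer. A MAML-style inner loop adapts both parameters on a task's support set, while an ANIL-style inner loop adapts only the head. The outer step shown is the first-order approximation used by FO-MAML; full MAML would additionally differentiate through the inner loop.

```python
def mse_and_grads(body, head, data):
    """Return (loss, dL/dbody, dL/dhead) for the toy model y_hat = head * body * x."""
    n = len(data)
    loss = g_body = g_head = 0.0
    for x, y in data:
        err = head * body * x - y
        loss += err * err / n
        g_body += 2.0 * err * head * x / n
        g_head += 2.0 * err * body * x / n
    return loss, g_body, g_head


def inner_adapt(body, head, support, lr=0.05, steps=5, adapt_body=True):
    """Task-specific adaptation on the support set.

    adapt_body=True  -> MAML / FO-MAML style: every parameter is adapted.
    adapt_body=False -> ANIL style: only the head is adapted, the body is reused.
    """
    for _ in range(steps):
        _, g_body, g_head = mse_and_grads(body, head, support)
        if adapt_body:
            body -= lr * g_body
        head -= lr * g_head
    return body, head


def first_order_meta_step(body, head, tasks, meta_lr=0.01, adapt_body=True):
    """One outer update using the first-order approximation (assumption: no
    second-order terms). The query-set gradient is evaluated at the adapted
    parameters and applied directly to the shared initialization."""
    sum_gb = sum_gh = 0.0
    for support, query in tasks:
        b_ad, h_ad = inner_adapt(body, head, support, adapt_body=adapt_body)
        _, g_body, g_head = mse_and_grads(b_ad, h_ad, query)
        sum_gb += g_body
        sum_gh += g_head
    k = len(tasks)
    return body - meta_lr * sum_gb / k, head - meta_lr * sum_gh / k


# Hypothetical toy task: y = 2x, split into a support set and a query set.
support = [(1.0, 2.0), (2.0, 4.0)]
query = [(3.0, 6.0)]

maml_body, maml_head = inner_adapt(1.0, 1.0, support, adapt_body=True)
anil_body, anil_head = inner_adapt(1.0, 1.0, support, adapt_body=False)
new_body, new_head = first_order_meta_step(1.0, 1.0, [(support, query)], adapt_body=False)
```

In this sketch the ANIL-style call leaves `anil_body` at its initial value during adaptation, so any task-specific change must be absorbed by the head alone. That restriction is one way to picture why a setting that needs significant task-specific adaptation could favour MAML and FO-MAML over ANIL.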


