MolCLR: Molecular Contrastive Learning of Representations via Graph Neural Networks

02/19/2021
by Yuyang Wang, et al.

Molecular machine learning holds promise for efficient molecular property prediction and drug discovery. However, because labeled data are limited and chemical space is vast, models trained with supervised learning generalize poorly, which greatly limits the application of machine learning to molecular design and discovery. In this work, we present MolCLR: Molecular Contrastive Learning of Representations via Graph Neural Networks (GNNs), a self-supervised learning framework for large unlabeled molecule datasets. Specifically, we first build a molecular graph in which each node represents an atom and each edge represents a chemical bond. A GNN then encodes the molecular graph. We propose three novel molecular graph augmentations: atom masking, bond deletion, and subgraph removal. A contrastive estimator maximizes the agreement between different graph augmentations of the same molecule. Experiments show that molecular representations learned by MolCLR transfer to multiple downstream molecular property prediction tasks, where our method achieves state-of-the-art performance on many challenging datasets. We also demonstrate the effectiveness of the proposed molecular graph augmentations on supervised molecular classification tasks.
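The pipeline described in the abstract can be illustrated with a minimal, library-free sketch. It is not the authors' implementation: the toy graph encoding (a list of atom symbols plus index-pair bonds), the `MASK` token, the `ratio` parameter, the BFS-based reading of subgraph removal, and the choice of an NT-Xent-style loss as the "contrastive estimator" are all assumptions made for illustration, and the GNN encoder is omitted, so the loss operates on hand-written embedding vectors.

```python
import math
import random

MASK = "*"  # hypothetical mask token, not taken from the paper


def atom_masking(atoms, bonds, ratio, rng):
    """Replace a random subset of atom labels with the mask token."""
    n_mask = max(1, int(ratio * len(atoms)))
    masked = set(rng.sample(range(len(atoms)), n_mask))
    return [MASK if i in masked else a for i, a in enumerate(atoms)], list(bonds)


def bond_deletion(atoms, bonds, ratio, rng):
    """Drop a random subset of bonds; atoms are left unchanged."""
    n_del = max(1, int(ratio * len(bonds)))
    dropped = set(rng.sample(range(len(bonds)), n_del))
    return list(atoms), [b for i, b in enumerate(bonds) if i not in dropped]


def subgraph_removal(atoms, bonds, ratio, rng):
    """Assumed reading: grow a connected subgraph by BFS from a random atom,
    mask its atoms, and delete the bonds that lie entirely inside it."""
    n_remove = max(1, int(ratio * len(atoms)))
    neighbors = {i: [] for i in range(len(atoms))}
    for u, v in bonds:
        neighbors[u].append(v)
        neighbors[v].append(u)
    start = rng.randrange(len(atoms))
    removed, queue = {start}, [start]
    while queue and len(removed) < n_remove:
        u = queue.pop(0)
        for v in neighbors[u]:
            if v not in removed and len(removed) < n_remove:
                removed.add(v)
                queue.append(v)
    new_atoms = [MASK if i in removed else a for i, a in enumerate(atoms)]
    new_bonds = [(u, v) for u, v in bonds if not (u in removed and v in removed)]
    return new_atoms, new_bonds


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def nt_xent(views_a, views_b, tau=0.1):
    """NT-Xent-style loss: views_a[i] and views_b[i] embed two augmentations
    of molecule i; every other view in the batch serves as a negative."""
    z = views_a + views_b
    n = len(views_a)
    total = 0.0
    for i in range(2 * n):
        j = (i + n) % (2 * n)  # index of the positive partner of view i
        denom = sum(math.exp(cosine(z[i], z[k]) / tau) for k in range(2 * n) if k != i)
        total += -math.log(math.exp(cosine(z[i], z[j]) / tau) / denom)
    return total / (2 * n)


# Heavy atoms of acetic acid as a toy graph: CH3-C(=O)-OH (bond orders ignored).
atoms = ["C", "C", "O", "O"]
bonds = [(0, 1), (1, 2), (1, 3)]
rng = random.Random(0)
view_1 = atom_masking(atoms, bonds, 0.25, rng)
view_2 = bond_deletion(atoms, bonds, 0.34, rng)
view_3 = subgraph_removal(atoms, bonds, 0.5, rng)
```

In a full pipeline, two augmented views of each molecule would be passed through the GNN encoder and the resulting embeddings fed to the loss. The loss is low when the two views of the same molecule map to similar embeddings and high when they are closer to views of other molecules, which is the "agreement" the abstract refers to.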


research
02/18/2022

Improving Molecular Contrastive Learning via Faulty Negative Mitigation and Decomposed Fragment Contrast

Deep learning has been a prevalence in computational chemistry and widel...
research
08/31/2019

Gated Graph Recursive Neural Networks for Molecular Property Prediction

Molecule property prediction is a fundamental problem for computer-aided...
research
01/13/2022

Improving VAE based molecular representations for compound property prediction

Collecting labeled data for many important tasks in chemoinformatics is ...
research
11/03/2022

A 3D-Shape Similarity-based Contrastive Approach to Molecular Representation Learning

Molecular shape and geometry dictate key biophysical recognition process...
research
07/03/2023

CardiGraphormer: Unveiling the Power of Self-Supervised Learning in Revolutionizing Drug Discovery

In the expansive realm of drug discovery, with approximately 15,000 know...
research
11/14/2018

CheMixNet: Mixed DNN Architectures for Predicting Chemical Properties using Multiple Molecular Representations

SMILES is a linear representation of chemical structures which encodes t...
research
07/22/2023

Extracting Molecular Properties from Natural Language with Multimodal Contrastive Learning

Deep learning in computational biochemistry has traditionally focused on...
