Robust Molecular Image Recognition: A Graph Generation Approach

05/28/2022
by   Yujie Qian, et al.

Molecular image recognition is a fundamental task in information extraction from chemistry literature. Previous data-driven models formulate it as an image-to-sequence task: they generate a sequential representation of the molecule (e.g., a SMILES string) from its graphical representation. Although they perform adequately on certain benchmarks, these models are not robust in real-world situations, where molecular images differ in style, quality, and chemical patterns. In this paper, we propose a novel graph generation approach that explicitly predicts atoms and bonds, along with their geometric layouts, to construct the molecular graph. We develop data augmentation strategies for molecules and images to increase the robustness of our model against domain shifts. Our model can flexibly incorporate chemistry constraints and produces more interpretable predictions than SMILES. In experiments on both synthetic and realistic molecular images, our model significantly outperforms previous models, achieving 84-93% accuracy on five benchmarks. We also conduct a human evaluation and show that our model reduces the time for a chemist to extract molecular structures from images by roughly 50%.
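To make the idea of "explicitly predicting atoms and bonds, along with their geometric layouts" more concrete, here is a minimal sketch of what such a graph-shaped output could look like, together with a simple chemistry constraint check. It is not the authors' implementation: the names `Atom`, `MolGraph`, `build_graph`, and the `MAX_VALENCE` table are hypothetical and chosen only for illustration, and the valence limits are a simplified assumption.

```python
from dataclasses import dataclass, field

# Simplified, illustrative valence limits (an assumption, not from the paper).
MAX_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2}


@dataclass
class Atom:
    symbol: str  # predicted element label
    x: float     # predicted 2D position in the image
    y: float


@dataclass
class MolGraph:
    atoms: list = field(default_factory=list)
    bonds: dict = field(default_factory=dict)  # (i, j) with i < j -> bond order

    def valence(self, i):
        """Sum of bond orders over all bonds that touch atom i."""
        return sum(order for (a, b), order in self.bonds.items() if i in (a, b))

    def violations(self):
        """Indices of atoms whose bond orders exceed the assumed valence limit."""
        return [
            i
            for i, atom in enumerate(self.atoms)
            if self.valence(i) > MAX_VALENCE.get(atom.symbol, float("inf"))
        ]


def build_graph(atom_preds, bond_preds):
    """Assemble a graph from (symbol, x, y) atoms and (i, j, order) bonds."""
    graph = MolGraph()
    for symbol, x, y in atom_preds:
        graph.atoms.append(Atom(symbol, x, y))
    for i, j, order in bond_preds:
        if i == j:
            raise ValueError("a bond must connect two different atoms")
        graph.bonds[(min(i, j), max(i, j))] = order
    return graph


# Heavy atoms of ethanol (C-C-O) with made-up coordinates.
atoms = [("C", 0.0, 0.0), ("C", 1.0, 0.5), ("O", 2.0, 0.0)]
ok = build_graph(atoms, [(0, 1, 1), (1, 2, 1)])
bad = build_graph(atoms, [(0, 1, 1), (1, 2, 3)])  # triple bond to oxygen
print(ok.violations(), bad.violations())
```

Because each atom and bond is an explicit object with a position, an implausible prediction (here, an over-bonded oxygen) can be located and flagged directly, which is harder to do with a flat SMILES string.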

research
11/16/2022

Molecular Fingerprints for Robust and Efficient ML-Driven Molecular Generation

We propose a novel molecular fingerprint-based variational autoencoder a...
research
01/04/2023

Fragment-based t-SMILES for de novo molecular generation

At present, sequence-based and graph-based models are two of popular use...
research
03/22/2022

Root-aligned SMILES for Molecular Retrosynthesis Prediction

Retrosynthesis prediction is a fundamental problem in organic synthesis,...
research
02/14/2018

Molecular Structure Extraction From Documents Using Deep Learning

Chemical structure extraction from documents remains a hard problem due ...
research
09/06/2021

Image recognition via Vietoris-Rips complex

Extracting informative features from images has been of capital importan...
research
02/19/2022

Image-to-Graph Transformers for Chemical Structure Recognition

For several decades, chemical knowledge has been published in written te...
research
02/08/2020

Hierarchical Generation of Molecular Graphs using Structural Motifs

Graph generation techniques are increasingly being adopted for drug disc...
