libmolgrid: GPU Accelerated Molecular Gridding for Deep Learning Applications

by Jocelyn Sunseri, et al.

There are many ways to represent a molecule as input to a machine learning model, and each preserves some kinds of information while discarding others. In the interest of preserving three-dimensional spatial information, including bond angles and torsions, we have developed libmolgrid, a general-purpose library for representing three-dimensional molecules using multidimensional arrays. The library also provides functionality for composing batches of data suited to machine learning workflows, including data augmentation, class balancing, and example stratification according to a regression variable or data subgroup. It further supports temporal and spatial recurrences over that data to facilitate work with recurrent neural networks, dynamical data, and size-extensive modeling. It was designed for seamless integration with popular deep learning frameworks, including Caffe, PyTorch, and Keras. It achieves good performance by using graphics processing units (GPUs) for computationally intensive tasks, and efficient memory usage by operating on memory views over preallocated buffers. libmolgrid is a free and open source project that is actively supported, serving the growing need in the molecular modeling community for tools that streamline data ingestion, representation construction, and principled machine learning model development.
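To make the idea of a grid representation concrete, the following is a minimal NumPy sketch of voxelizing atoms into a multidimensional array and writing the result into a view of a preallocated batch buffer. It does not use libmolgrid's API. The function name `voxelize`, its parameters, the Gaussian density model, and the grid dimensions are all assumptions chosen for illustration, and the sketch runs on the CPU only.

```python
import numpy as np

def voxelize(coords, types, center, out, dim=8.0, sigma=1.0):
    """Hypothetical helper (not part of libmolgrid): accumulate one Gaussian
    density per atom into `out`, an array of shape (n_types, n, n, n)."""
    n = out.shape[1]
    out[...] = 0.0  # reuse the caller's buffer instead of allocating a new one

    # Coordinates of the grid points along each axis, centered on `center`.
    offsets = np.linspace(-dim / 2, dim / 2, n)
    gx, gy, gz = (center[i] + offsets for i in range(3))

    for (x, y, z), t in zip(coords, types):
        # exp(-d^2 / (2 sigma^2)) factors into three one-dimensional terms.
        wx = np.exp(-((gx - x) ** 2) / (2 * sigma ** 2))
        wy = np.exp(-((gy - y) ** 2) / (2 * sigma ** 2))
        wz = np.exp(-((gz - z) ** 2) / (2 * sigma ** 2))
        # Broadcasting builds the 3D density; channel t holds atom type t.
        out[t] += wx[:, None, None] * wy[None, :, None] * wz[None, None, :]
    return out

# Preallocated batch buffer: 2 examples, 2 atom-type channels, 17^3 grid points
# (an 8 Angstrom cube sampled every 0.5 Angstrom; assumed values).
batch = np.zeros((2, 2, 17, 17, 17), dtype=np.float32)

coords = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])  # toy two-atom "molecule"
types = [0, 1]                                         # one channel per atom

# batch[0] is a view, so the grid is written in place with no extra copy.
voxelize(coords, types, center=(0.0, 0.0, 0.0), out=batch[0])
```

Each atom type gets its own channel, so the resulting array has the same layout as a multi-channel 3D image and can be passed to a convolutional network after conversion to the framework's tensor type. Writing into a slice of the batch buffer mirrors, in spirit only, the memory-view approach the abstract describes.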


