Transfer Learning Using Ensemble Neural Nets for Organic Solar Cell Screening

by Arindam Paul, et al.

Organic solar cells are a promising technology for addressing the world's clean energy crisis. However, generating candidate chemical compounds for solar cells is a time-consuming process that requires thousands of hours of laboratory analysis. The most important property of a solar cell is its power conversion efficiency, which depends on the highest occupied molecular orbital (HOMO) values of the donor molecules. Recently, machine learning techniques have proved very effective in building predictive models for the HOMO values of donor structures in organic photovoltaic cells (OPVs). Because experimental datasets are limited in size, current machine learning models are trained on calculations based on density functional theory (DFT). Molecular line notations such as SMILES and InChI are popular input representations for describing the molecular structure of donor molecules. The two notations encode different information: for example, SMILES defines bond types, while InChI defines protonation. In this work, we present an ensemble deep neural network architecture, called SINet, which harnesses both the SMILES and InChI molecular representations to predict HOMO values and leverages transfer learning from a large DFT-computed dataset, Harvard CEP, to build more robust predictive models for the much smaller HOPV datasets. The Harvard CEP dataset contains molecular structures and properties for 2.3 million candidate donor structures for OPVs, while HOPV contains DFT-computed and experimental values for 350 and 243 molecules, respectively. Our results demonstrate significant performance improvement from the use of transfer learning as well as from leveraging both molecular representations.
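To make the idea of two parallel line-notation inputs concrete, the following minimal sketch encodes the same molecules as character-level one-hot matrices, once from SMILES and once from InChI. It illustrates only the input side. The character-level one-hot encoding, the helper names `build_vocab` and `one_hot`, and the example molecules (benzene and ethanol, which are not OPV donor structures) are assumptions made for illustration, not details taken from SINet.

```python
from typing import Dict, List


def build_vocab(strings: List[str]) -> Dict[str, int]:
    """Map each distinct character to an index; index 0 is reserved for padding."""
    chars = sorted({ch for s in strings for ch in s})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def one_hot(s: str, vocab: Dict[str, int], max_len: int) -> List[List[int]]:
    """Return a max_len x (len(vocab) + 1) one-hot matrix, right-padded with index 0.

    Characters missing from the vocabulary raise KeyError (acceptable for a sketch).
    """
    width = len(vocab) + 1
    indices = [vocab[ch] for ch in s[:max_len]]
    indices += [0] * (max_len - len(indices))
    return [[1 if j == idx else 0 for j in range(width)] for idx in indices]


# Illustrative molecules only (benzene, ethanol); hypothetical stand-ins for donors.
smiles = ["c1ccccc1", "CCO"]
inchi = [
    "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H",
    "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3",
]

# Each notation gets its own vocabulary and padded length,
# since the two notations use different character sets.
smiles_vocab = build_vocab(smiles)
inchi_vocab = build_vocab(inchi)
smiles_len = max(len(s) for s in smiles)
inchi_len = max(len(s) for s in inchi)

# One matrix per molecule and notation; each list would feed a separate model branch.
smiles_inputs = [one_hot(s, smiles_vocab, smiles_len) for s in smiles]
inchi_inputs = [one_hot(s, inchi_vocab, inchi_len) for s in inchi]
```

Each molecule ends up with two differently shaped matrices, one per notation. In an ensemble of the kind the abstract describes, these would feed separate branches whose outputs are then combined; how SINet combines them is not specified here.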




