ChemoVerse: Manifold traversal of latent spaces for novel molecule discovery
In order to design a more potent and effective chemical entity, it is essential to identify molecular structures with the desired chemical properties. Recent advances in generative models using neural networks and machine learning are being widely used by many emerging startups and researchers in this domain to design virtual libraries of drug-like compounds. Although these models can help a scientist to produce novel molecular structures rapidly, the challenge still exists in the intelligent exploration of the latent spaces of generative models, thereby reducing the randomness in the generative procedure. In this work we present a manifold traversal with heuristic search to explore the latent chemical space. Different heuristics and scores such as the Tanimoto coefficient, synthetic accessibility, binding activity, and QED drug-likeness can be incorporated to increase the validity and proximity for desired molecular properties of the generated molecules. For evaluating the manifold traversal exploration, we produce the latent chemical space using various generative models such as grammar variational autoencoders (with and without attention) as they deal with the randomized generation and validity of compounds. With this novel traversal method, we are able to find more unseen compounds and more specific regions to mine in the latent space. Finally, these components are brought together in a simple platform allowing users to perform search, visualization and selection of novel generated compounds.
READ FULL TEXT