Generative chemical transformer: attention makes neural machine learn molecular geometric structures via text
A chemical formula is an artificial language that expresses molecules as text. Neural machines that have learned chemical language can be used as tools for inverse molecular design. Here, we propose the generative chemical Transformer (GCT), a neural machine that creates molecules meeting desired conditions based on a deep understanding of chemical language. By attending sparsely to characters, the attention mechanism in GCT allows a deeper understanding of molecular structures, beyond the limitations of chemical language itself that cause semantic discontinuity. We investigate the significance of language models for inverse molecular design problems by quantitatively evaluating the quality of the generated molecules. GCT generates highly realistic chemical strings that satisfy both chemical rules and the grammar of the language. Molecules parsed from the generated strings simultaneously satisfy multiple target properties and are diverse for a single condition set. GCT generates de novo molecules in a short time that human experts cannot match. These advances will contribute to improving the quality of human life by accelerating the discovery of desired materials.
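The semantic discontinuity mentioned in the abstract can be illustrated with a small sketch. It assumes the chemical language is SMILES (the abstract does not name the notation) and reads discontinuity as the case where two atoms that are bonded in the molecule sit far apart in the string, as happens at ring closures. The tokenizer and the helper `ring_closure_gaps` are hypothetical, simplified illustrations and are not part of GCT.

```python
import re

# Simplified SMILES tokenizer (an assumption for illustration, not a full parser):
# bracket atoms, two-letter halogens, single-letter atoms, ring-closure labels,
# bond symbols, and branch parentheses.
TOKEN_RE = re.compile(
    r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]|%\d{2}|\d|[=#\-+/\\()\.:@]"
)


def tokenize(smiles):
    """Split a SMILES string into a list of tokens."""
    return TOKEN_RE.findall(smiles)


def ring_closure_gaps(smiles):
    """Return (label, open_index, close_index) for each ring-closure pair.

    Both indices are token positions of the two atoms joined by the ring bond.
    The atoms are adjacent in the molecule but can be far apart in the text.
    """
    open_rings = {}   # ring label -> token index of the atom that opened it
    pairs = []
    last_atom = None  # token index of the most recent atom
    for i, tok in enumerate(tokenize(smiles)):
        if tok.isdigit() or tok.startswith("%"):
            label = tok.lstrip("%")
            if label in open_rings:
                pairs.append((label, open_rings.pop(label), last_atom))
            else:
                open_rings[label] = last_atom
        elif tok[0].isalpha() or tok.startswith("["):
            last_atom = i
    return pairs


# Benzene: the first and last aromatic carbons are bonded,
# yet they are six tokens apart in the string.
print(ring_closure_gaps("c1ccccc1"))  # [('1', 0, 6)]
```

A model that reads the string strictly left to right has to carry the opening atom across every intervening token, whereas an attention layer can in principle connect the two positions directly. Whether GCT's attention heads actually pick up such pairs is a claim of the paper and is not demonstrated by this sketch.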