N-Gram Graph, A Novel Molecule Representation
Virtual high-throughput screening provides a strategy for prioritizing compounds for physical screens. Machine learning methods offer an additional benefit by making predictions about molecules, yet choosing a molecule representation remains a challenge when selecting algorithms. We first emphasize the effects of different levels of molecule representation. We then introduce the N-gram graph, a novel representation for molecular graphs. We demonstrate that, paired with several non-deep machine learning methods, the N-gram graph attains the most accurate predictions on multiple tasks.