Learning protein conformational space by enforcing physics with convolutions and latent interpolations
Determining the different conformational states of a protein and the transition paths between them is a central challenge in protein biochemistry, and is key to better understanding the relationship between biomolecular structure and function. This task is typically accomplished by sampling the protein conformational space with microseconds of molecular dynamics (MD) simulations. Despite advances in both computing hardware and enhanced sampling techniques, MD will always yield a discretized representation of this space, with transition states undersampled in proportion to their associated energy barrier. We design a convolutional neural network capable of learning a continuous and physically plausible representation of conformational space, given example conformations generated by experiments and simulations. We show that this network, trained with MD simulations of two distinct protein states, can correctly predict a possible transition path between them without being given any example of the transition state. We then show that our network, having a protein-independent architecture, can be trained in a transfer learning scenario, leading to performance superior to that of a network trained from scratch.