StemP: A fast and deterministic Stem-graph approach for RNA and protein folding prediction
We propose a new deterministic methodology for predicting the folding of RNA sequences and proteins. Is the stem enough for structure prediction? The main idea is to consider all possible stem formations in a given sequence. Using the stem-loop energy and the strength of each stem, we explore how to utilize stem information deterministically for RNA sequence and protein folding structure prediction. We use graph notation, in which all possible stems are represented as vertices and the co-existence of two stems as an edge. This full Stem-graph represents all possible folding structures, and we pick the sub-graph(s) that give the best matching energy for folding structure prediction. We introduce a Stem-Loop score to add structure information and to speed up the computation. The proposed method can handle secondary structure prediction as well as protein folding with pseudoknots. Numerical experiments are run on a laptop, and results take only seconds to a few minutes. The strengths of this approach are the simplicity and flexibility of the algorithm and the fact that it gives a deterministic answer. We explore protein sequences from the Protein Data Bank, rRNA 5S sequences, and tRNA sequences from the Gutell Lab. Various experiments and comparisons are included to validate the proposed method.
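The Stem-graph idea described above can be illustrated with a minimal Python sketch. This is not the authors' implementation. It makes several simplifying assumptions that are not in the abstract: stems are maximal runs of Watson-Crick or G-U wobble pairs, two stems can co-exist whenever they share no base (so crossing, pseudoknot-like combinations are allowed), the score of a sub-graph is simply its number of base pairs rather than a matching energy or Stem-Loop score, and the best sub-graph is found by brute-force clique enumeration. The names `find_stems`, `build_stem_graph`, `best_clique`, `min_len`, and `min_loop` are hypothetical.

```python
from itertools import combinations

# Assumed pairing rule: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def find_stems(seq, min_len=3, min_loop=3):
    """List maximal stems as (i, j, length), pairing i+k with j-k for k < length."""
    n = len(seq)
    stems = []
    for i in range(n):
        for j in range(i + 1, n):
            if (seq[i], seq[j]) not in PAIRS:
                continue
            # Skip if the stem could be extended outward: it is not maximal.
            if i > 0 and j < n - 1 and (seq[i - 1], seq[j + 1]) in PAIRS:
                continue
            length = 0
            # Grow inward while bases pair and at least min_loop bases stay unpaired.
            while (i + length + min_loop < j - length
                   and (seq[i + length], seq[j - length]) in PAIRS):
                length += 1
            if length >= min_len:
                stems.append((i, j, length))
    return stems


def positions(stem):
    """Set of sequence indices occupied by a stem."""
    i, j, length = stem
    return set(range(i, i + length)) | set(range(j - length + 1, j + 1))


def build_stem_graph(stems):
    """Edges (a, b) with a < b join stems that share no base and so can co-exist."""
    edges = set()
    for a, b in combinations(range(len(stems)), 2):
        if positions(stems[a]).isdisjoint(positions(stems[b])):
            edges.add((a, b))
    return edges


def best_clique(stems, edges):
    """Brute-force search for the clique with the most base pairs (toy score)."""
    best = ([], 0)

    def extend(clique, candidates, score):
        nonlocal best
        if score > best[1]:
            best = (list(clique), score)
        for idx, v in enumerate(candidates):
            # Keep only later candidates that co-exist with v.
            rest = [u for u in candidates[idx + 1:] if (v, u) in edges]
            extend(clique + [v], rest, score + stems[v][2])

    extend([], list(range(len(stems))), 0)
    return best


if __name__ == "__main__":
    seq = "GGGAAAACCCAGGGAAAACCC"  # made-up toy sequence, not from the paper's data
    stems = find_stems(seq)
    chosen, score = best_clique(stems, build_stem_graph(stems))
    print([stems[v] for v in chosen], score)
```

The brute-force search is exponential in the number of stems and is only practical for toy inputs. A scoring scheme such as the Stem-Loop score mentioned above would replace the pair count used here.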