DeeperBind: Enhancing Prediction of Sequence Specificities of DNA Binding Proteins

11/17/2016
by Hamid Reza Hassanzadeh, et al.

Transcription factors (TFs) are macromolecules that bind to specific cis-regulatory sub-regions of DNA promoters and initiate transcription. Finding the exact location of these binding sites (also known as motifs) is important in a variety of domains, such as drug design and development. To address this need, several in vivo and in vitro techniques have been developed to characterize and predict the binding specificity of a protein to different DNA loci. The major problem with these techniques is that they are not accurate enough in predicting binding affinity and characterizing the corresponding motifs. As a result, downstream analysis is required to uncover the locations where proteins of interest bind. Here, we propose DeeperBind, a long short-term recurrent convolutional network for predicting protein binding specificities with respect to DNA probes. DeeperBind can model the positional dynamics of probe sequences and hence effectively accounts for the contributions made by individual sub-regions of DNA sequences. Moreover, it can be trained and tested on datasets containing varying-length sequences. We apply our pipeline to datasets derived from protein binding microarrays (PBMs), an in vitro high-throughput technology for quantifying protein-DNA binding preferences, and present promising results. To the best of our knowledge, this is the most accurate pipeline for predicting the binding specificities of DNA sequences from data produced by high-throughput technologies, using deep learning for feature generation and positional dynamics modeling.
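
To make the architecture described in the abstract more concrete, the following is a minimal sketch of a convolution-then-LSTM forward pass over DNA probes of different lengths. It is not the authors' implementation: the abstract gives no layer sizes or training details, so the filter count, filter width, hidden size, the linear readout, and all function names (`one_hot`, `conv1d_relu`, `lstm_final_state`, `init_model`, `predict`) are assumptions chosen for illustration, and the weights are random and untrained.

```python
import math
import random

ALPHABET = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-value one-hot rows (order A, C, G, T).
    Any other symbol (e.g. N) becomes an all-zero row."""
    return [[1.0 if base == b else 0.0 for b in ALPHABET] for base in seq.upper()]


def conv1d_relu(x, filters, biases):
    """'Valid' 1-D convolution along the sequence, followed by ReLU.
    x: L rows of 4 values; filters: list of width x 4 weight grids.
    Returns L - width + 1 rows, one value per filter."""
    width = len(filters[0])
    out = []
    for start in range(len(x) - width + 1):
        row = []
        for w, b in zip(filters, biases):
            s = b + sum(w[i][c] * x[start + i][c]
                        for i in range(width) for c in range(4))
            row.append(max(0.0, s))
        out.append(row)
    return out


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_final_state(xs, gates, hidden):
    """Run a single LSTM layer over the rows of xs; return the last hidden state."""
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in xs:
        z = x + h  # concatenate current input with previous hidden state
        pre = {name: [b[j] + sum(W[j][k] * z[k] for k in range(len(z)))
                      for j in range(hidden)]
               for name, (W, b) in gates.items()}
        i = [sigmoid(v) for v in pre["i"]]    # input gate
        f = [sigmoid(v) for v in pre["f"]]    # forget gate
        o = [sigmoid(v) for v in pre["o"]]    # output gate
        g = [math.tanh(v) for v in pre["g"]]  # candidate cell update
        c = [f[j] * c[j] + i[j] * g[j] for j in range(hidden)]
        h = [o[j] * math.tanh(c[j]) for j in range(hidden)]
    return h


def init_model(n_filters=4, width=5, hidden=3, seed=0):
    """Random, untrained weights; every size here is an illustrative assumption."""
    rng = random.Random(seed)

    def rand(*shape):
        if len(shape) == 1:
            return [rng.uniform(-0.5, 0.5) for _ in range(shape[0])]
        return [rand(*shape[1:]) for _ in range(shape[0])]

    return {
        "filters": rand(n_filters, width, 4),
        "conv_bias": rand(n_filters),
        "gates": {name: (rand(hidden, n_filters + hidden), rand(hidden))
                  for name in "ifog"},
        "hidden": hidden,
        "w_out": rand(hidden),
        "b_out": 0.0,
    }


def predict(model, seq):
    """Score one probe of any length: one-hot -> conv -> LSTM -> linear readout."""
    feats = conv1d_relu(one_hot(seq), model["filters"], model["conv_bias"])
    h = lstm_final_state(feats, model["gates"], model["hidden"])
    return model["b_out"] + sum(w * v for w, v in zip(model["w_out"], h))


model = init_model()
for probe in ("ACGTACGTAC", "TTGACGTCAATTGACGTCAA"):
    print(len(probe), round(predict(model, probe), 4))
```

Because the recurrent layer consumes however many positions the convolution produces, the same weights can score a 10-base and a 20-base probe without padding, which is one plausible way to read the abstract's claim about varying-length sequences. With untrained weights the printed scores carry no biological meaning.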


