A Pipeline for Integrated Theory and Data-Driven Modeling of Genomic and Clinical Data

05/05/2020
by Vineet K Raghu, et al.

High-throughput genomic profiling technologies such as RNA-Seq and microarrays have the potential to transform clinical decision making and biomedical research by measuring the genome at a granular level. However, to truly understand the causes of disease and the effects of medical interventions, these data must be integrated with phenotypic, environmental, and behavioral data from individuals. Effective knowledge discovery methods that can infer relationships between these data types are also required. In this work, we propose a pipeline for knowledge discovery from integrated genomic and clinical data. The pipeline begins with a novel variable selection method and then uses a probabilistic graphical model to characterize the relationships between features in the data. We demonstrate how this pipeline can improve breast cancer outcome prediction models and provide a biologically interpretable view of sequencing data.
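
The abstract does not specify the variable selection method or the graphical model, so the following is only a minimal sketch of the two-stage idea (select variables, then learn a graph over them), not the authors' method. It assumes a simple stand-in for each stage: ranking features by absolute correlation with the outcome, and drawing undirected edges from Gaussian partial correlations. The function names `select_features` and `partial_correlation_edges`, the threshold, and the synthetic data are all hypothetical.

```python
import numpy as np

def select_features(X, y, k):
    """Stand-in selection step: indices of the k columns of X with the
    largest absolute Pearson correlation with y (assumes no constant columns)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    corr = (Xc * yc[:, None]).sum(axis=0) / denom
    return np.sort(np.argsort(-np.abs(corr))[:k])

def partial_correlation_edges(data, threshold):
    """Stand-in graph step: undirected edges (i, j) whose absolute partial
    correlation, read off the inverse covariance matrix, exceeds threshold."""
    precision = np.linalg.inv(np.cov(data, rowvar=False))
    d = np.sqrt(np.diag(precision))
    partial = -precision / np.outer(d, d)
    p = data.shape[1]
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(partial[i, j]) > threshold]

# Synthetic data (an assumption for illustration): feature 0 drives
# feature 1, feature 1 drives the outcome, and the rest are noise.
rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 20))
X[:, 1] = X[:, 0] + 0.5 * rng.normal(size=n)
y = X[:, 1] + 0.5 * rng.normal(size=n)

selected = select_features(X, y, k=3)
# Append the outcome as the last column so it becomes a node in the graph.
data = np.column_stack([X[:, selected], y])
edges = partial_correlation_edges(data, threshold=0.2)
print("selected features:", selected)
print("edges among selected features and outcome:", edges)
```

In this toy setup the graph separates direct from indirect relationships: feature 0 is marginally correlated with the outcome, but its partial correlation with the outcome is near zero once feature 1 is accounted for. That is the kind of interpretability a graphical model adds on top of a prediction model.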


