Gaussian process regression for survival time prediction with genome-wide gene expression

by Aaron J. Molstad, et al.

Predicting the survival time of a cancer patient from their genome-wide gene expression remains a challenging problem. For certain types of cancer, the effects of gene expression on survival are both weak and abundant, so identifying nonzero effects with reasonable accuracy is difficult. As an alternative to methods that use variable selection, we propose a Gaussian process accelerated failure time model to predict survival time using genome-wide or pathway-wide gene expression data. Using a Monte Carlo EM algorithm, we jointly impute censored log-survival times and estimate model parameters. We demonstrate the performance of our method, and its advantage over existing methods, in both simulations and an analysis of real data. The real data were collected from 513 patients with kidney renal clear cell carcinoma and include survival time, demographic/clinical variables, and expression of more than 20,000 genes. Our method is widely applicable: it can accommodate right, left, and interval censored outcomes, and it provides a natural way to combine multiple types of high-dimensional -omics data. An R package implementing our method is available online.
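
To make the imputation idea concrete, here is a minimal Python sketch of a Monte Carlo step for a single right-censored observation. It is not the authors' algorithm or their R package: it assumes, purely for illustration, that the log-survival time of one patient is normal with a known mean and standard deviation, whereas the paper's model works with the conditional distribution implied by a Gaussian process. The function name `impute_right_censored` and all of its parameters are hypothetical.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for cdf / inverse cdf


def impute_right_censored(mean, sd, log_censor_time, n_draws=1000, rng=None):
    """Monte Carlo estimate of E[log T | log T > c] when log T ~ N(mean, sd^2).

    Illustrative assumption: a plain univariate normal, not a GP conditional.
    Returns (estimated conditional mean, list of draws).
    """
    rng = rng or random.Random(0)
    eps = 1e-12
    # Probability mass below the censoring point, clamped so that every
    # probability passed to inv_cdf stays strictly inside (0, 1).
    lower = _STD_NORMAL.cdf((log_censor_time - mean) / sd)
    lower = min(max(lower, eps), 1.0 - 2.0 * eps)
    draws = []
    for _ in range(n_draws):
        # Inverse-CDF sampling restricted to the upper tail [lower, 1 - eps).
        u = lower + (1.0 - eps - lower) * rng.random()
        draws.append(mean + sd * _STD_NORMAL.inv_cdf(u))
    return sum(draws) / n_draws, draws


# Example: a patient censored at log-time 0 under a standard normal model.
estimate, samples = impute_right_censored(0.0, 1.0, 0.0, n_draws=5000)
```

In a full Monte Carlo EM loop, draws like these would stand in for the unobserved log-survival times when updating the model parameters, and the two steps would alternate until the estimates stabilize. Left and interval censoring would restrict the sampled range to a different portion of the distribution in the same way.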


A Hierarchical Spike-and-Slab Model for Pan-Cancer Survival Using Pan-Omic Data

Pan-omics, pan-cancer analysis has advanced our understanding of the mol...

Predicting phenotypes from microarrays using amplified, initially marginal, eigenvector regression

Motivation: The discovery of relationships between gene expression measu...

Identifying cancer subtypes in glioblastoma by combining genomic, transcriptomic and epigenomic data

We present a nonparametric Bayesian method for disease subtype discovery...

A Framework for Mediation Analysis with Multiple Exposures, Multivariate Mediators, and Non-Linear Response Models

Mediation analysis seeks to identify and quantify the paths by which an ...

An Enhanced MA Plot with R-Shiny to Ease Exploratory Analysis of Transcriptomic Data

MA plots are used to analyze the genome-wide differences in gene express...

Gene Expression based Survival Prediction for Cancer Patients: A Topic Modeling Approach

Cancer is one of the leading causes of death worldwide. Many believe tha...

SurvODE: Extrapolating Gene Expression Distribution for Early Cancer Identification

With the increasingly available large-scale cancer genomics datasets, ma...
