Estimating a common covariance matrix for network meta-analysis of gene expression datasets in diffuse large B-cell lymphoma

by Anders Ellern Bilgrau, et al.

The estimation of covariance matrices of gene expressions has many applications in cancer systems biology. Many gene expression studies, however, are hampered by low sample size, and it has therefore become popular to increase sample size by collecting gene expression data across studies. Motivated by traditional meta-analysis using random effects models, we present a hierarchical random covariance model and use it for the meta-analysis of gene correlation networks across 11 large-scale gene expression studies of diffuse large B-cell lymphoma (DLBCL). We propose a maximum likelihood estimator for the underlying common covariance matrix and introduce an EM algorithm to compute it. In simulation experiments that compared the estimated covariance matrices by cophenetic correlation and Kullback-Leibler divergence, the proposed estimator performed at least as well as a simple pooled estimator. In a post hoc analysis of the estimated common covariance matrix for the DLBCL data, we identified novel, biologically meaningful gene correlation networks with eigengenes of prognostic value. In conclusion, the method appears to provide a generally applicable framework for meta-analysis when multiple features are measured and believed to share a common covariance matrix obscured by study-dependent noise.
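To make the comparison baseline concrete, the following is a minimal sketch of a simple pooled covariance estimator and of the Kullback-Leibler divergence between two zero-mean Gaussian distributions. It is not the hierarchical model or the EM algorithm from the paper. The function names, the toy data, the zero-mean assumption, and the direction of the divergence are all illustrative assumptions, and for simplicity every simulated study here shares exactly the same covariance matrix.

```python
import numpy as np


def pooled_covariance(datasets):
    """Pooled estimate: summed per-study scatter matrices over total degrees of freedom."""
    scatter = 0.0
    dof = 0
    for x in datasets:
        centered = x - x.mean(axis=0)      # center each study separately
        scatter = scatter + centered.T @ centered
        dof += x.shape[0] - 1
    return scatter / dof


def gaussian_kl(sigma_p, sigma_q):
    """KL(N(0, sigma_p) || N(0, sigma_q)) for zero-mean multivariate Gaussians."""
    p = sigma_p.shape[0]
    _, logdet_p = np.linalg.slogdet(sigma_p)
    _, logdet_q = np.linalg.slogdet(sigma_q)
    trace_term = np.trace(np.linalg.solve(sigma_q, sigma_p))
    return 0.5 * (trace_term - p + logdet_q - logdet_p)


# Toy data (hypothetical): three studies of different size, two features.
rng = np.random.default_rng(0)
true_sigma = np.array([[1.0, 0.5], [0.5, 2.0]])
studies = [
    rng.multivariate_normal(np.zeros(2), true_sigma, size=n) for n in (30, 50, 80)
]

pooled = pooled_covariance(studies)
kl = gaussian_kl(true_sigma, pooled)
print(pooled)
print(kl)
```

A smaller divergence indicates an estimate closer to the true covariance matrix. In a setting with study-dependent noise, each study would instead be drawn from its own perturbed covariance matrix, which is the situation the hierarchical model is meant to address.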



