The Cluster Graphical Lasso for improved estimation of Gaussian graphical models

07/19/2013
by   Kean Ming Tan, et al.

We consider the task of estimating a Gaussian graphical model in the high-dimensional setting. The graphical lasso, which involves maximizing the Gaussian log likelihood subject to an l1 penalty, is a well-studied approach for this task. We begin by introducing a surprising connection between the graphical lasso and hierarchical clustering: the graphical lasso in effect performs a two-step procedure, in which (1) single linkage hierarchical clustering is performed on the variables in order to identify connected components, and then (2) an l1-penalized log likelihood is maximized on the subset of variables within each connected component. In other words, the graphical lasso determines the connected components of the estimated network via single linkage clustering. Unfortunately, single linkage clustering is known to perform poorly in certain settings. Therefore, we propose the cluster graphical lasso, which involves clustering the features using an alternative to single linkage clustering, and then performing the graphical lasso on the subset of variables within each cluster. We establish model selection consistency for this technique, and demonstrate its improved performance relative to the graphical lasso in a simulation study, as well as in applications to an equities data set, a university webpage data set, and a gene expression data set.
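Step (1) of the two-step view above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the similarity between variables i and j is the absolute sample covariance |S[i][j]|, and that cutting the single linkage dendrogram at the penalty level is equivalent to linking i and j whenever |S[i][j]| exceeds that level. The function name `thresholded_components`, the parameter `lam`, and the matrix `S` are hypothetical and chosen only for this example. Step (2), fitting the l1-penalized likelihood within each component, is omitted.

```python
from typing import Dict, List, Sequence


def thresholded_components(S: Sequence[Sequence[float]], lam: float) -> List[List[int]]:
    """Group variables into connected components of the graph that has an
    edge between i and j whenever |S[i][j]| > lam (assumed similarity)."""
    p = len(S)
    parent = list(range(p))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the groups of every pair whose absolute covariance exceeds lam.
    for i in range(p):
        for j in range(i + 1, p):
            if abs(S[i][j]) > lam:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    # Collect variable indices by their group representative.
    groups: Dict[int, List[int]] = {}
    for i in range(p):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical 4-variable sample covariance matrix (illustrative values only).
S = [
    [1.0, 0.6, 0.1, 0.0],
    [0.6, 1.0, 0.3, 0.1],
    [0.1, 0.3, 1.0, 0.5],
    [0.0, 0.1, 0.5, 1.0],
]

components_loose = thresholded_components(S, lam=0.2)
components_tight = thresholded_components(S, lam=0.4)
print(components_loose)
print(components_tight)
```

In this toy matrix, variables 1 and 2 are linked only through the single moderate entry 0.3, so a small penalty chains all four variables into one component. That chaining behaviour is the kind of weakness of single linkage that motivates replacing this step with a different clustering method.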


