High-Dimensional Undirected Graphical Models for Arbitrary Mixed Data

11/21/2022
by Konstantin Göbler, et al.

Graphical models are an important tool for exploring relationships between variables in complex, multivariate data. Methods for learning such graphical models are well developed in the case where all variables are either continuous or discrete, including in high dimensions. However, in many applications data span variables of different types (e.g. continuous, count, binary, ordinal), whose principled joint analysis is nontrivial. Latent Gaussian copula models, in which all variables are modeled as transformations of underlying jointly Gaussian variables, represent a useful approach. Recent advances have shown how the binary-continuous case can be tackled, but the general mixed variable type regime remains challenging. In this work, we make the simple yet useful observation that classical ideas concerning polychoric and polyserial correlations can be leveraged in a latent Gaussian copula framework. Building on this observation, we propose flexible and scalable methodology for data with variables of entirely general mixed type. We study the key properties of the approaches theoretically and empirically, via extensive simulations as well as an illustrative application to data from the UK Biobank concerning COVID-19 risk factors.
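To make the polychoric idea concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes the classical two-step approach: estimate the latent thresholds from the marginal proportions, then maximise the bivariate normal likelihood over the latent correlation. The function names `thresholds` and `polychoric`, the simulated data, and the finite stand-ins of ±8 for infinite thresholds are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import multivariate_normal, norm


def thresholds(x):
    """Estimate latent cut points from the cumulative proportions of an ordinal variable."""
    levels, counts = np.unique(x, return_counts=True)
    cum = np.cumsum(counts)[:-1] / counts.sum()
    # Assumption: -8 and 8 stand in for -inf and +inf on the standard normal scale.
    return levels, np.concatenate(([-8.0], norm.ppf(cum), [8.0]))


def polychoric(x, y):
    """Two-step polychoric correlation: fix the thresholds, then maximise the likelihood in rho."""
    lx, tx = thresholds(x)
    ly, ty = thresholds(y)
    # Contingency table of observed category pairs.
    table = np.array([[np.sum((x == a) & (y == b)) for b in ly] for a in lx])

    def neg_loglik(rho):
        cov = [[1.0, rho], [rho, 1.0]]
        # Bivariate normal CDF evaluated on the grid of threshold pairs.
        grid = np.array(
            [[multivariate_normal.cdf([a, b], mean=[0.0, 0.0], cov=cov) for b in ty] for a in tx]
        )
        # Probability of each latent rectangle via inclusion-exclusion.
        probs = grid[1:, 1:] - grid[:-1, 1:] - grid[1:, :-1] + grid[:-1, :-1]
        return -np.sum(table * np.log(np.clip(probs, 1e-12, None)))

    return minimize_scalar(neg_loglik, bounds=(-0.99, 0.99), method="bounded").x


# Simulated example: latent Gaussian pair with correlation 0.6, observed only as categories.
rng = np.random.default_rng(0)
true_rho = 0.6
z = rng.multivariate_normal([0.0, 0.0], [[1.0, true_rho], [true_rho, 1.0]], size=3000)
x_obs = np.digitize(z[:, 0], [-0.5, 0.5])  # ordinal variable with three levels
y_obs = (z[:, 1] > 0.0).astype(int)        # binary variable

rho_hat = polychoric(x_obs, y_obs)
print(f"latent correlation estimate: {rho_hat:.3f}")
```

In a graphical-model setting, pairwise estimates of this kind would be assembled into a latent correlation matrix and passed to a sparse precision-matrix estimator; that step is omitted here.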

research
02/14/2012

Learning mixed graphical models from data with p larger than n

Structure learning of Gaussian graphical models is an extensively studie...
research
01/10/2013

A Mixed Graphical Model for Rhythmic Parsing

A method is presented for the rhythmic parsing problem: Given a sequence...
research
11/02/2017

Bayesian latent Gaussian graphical models for mixed data with marginal prior information

Associations between variables of mixed types are of interest in a varie...
research
06/30/2015

Selective Inference and Learning Mixed Graphical Models

This thesis studies two problems in modern statistics. First, we study s...
research
09/17/2018

Rank-based approach for estimating correlations in mixed ordinal data

High-dimensional mixed data as a combination of both continuous and ordi...
research
05/13/2022

Semiparametric Gaussian Copula Regression modeling for Mixed Data Types (SGCRM)

Many clinical and epidemiological studies encode collected participant-l...
research
08/20/2021

latentcor: An R Package for estimating latent correlations from mixed data types

We present `latentcor`, an R package for correlation estimation from dat...
