Estimating Gaussian graphical models of multi-study data with Multi-Study Factor Analysis
Network models are powerful tools for gaining new insights from complex biological data. Most lines of investigation in biology involve comparing datasets in which the same predictors are measured across multiple studies or conditions (multi-study data). Consequently, the development of statistical tools for network modeling of multi-study data is a highly active area of research. Multi-study factor analysis (MSFA) is a method for estimating latent variables (factors) in multi-study data. In this work, we generalize MSFA by adding the capacity to estimate Gaussian graphical models (GGMs). Our new tool, MSFA-X, is a framework for latent variable-based graphical modeling of shared and study-specific signals in multi-study data. We demonstrate through simulation that MSFA-X can recover shared and study-specific GGMs and that it outperforms a graphical lasso benchmark. We apply MSFA-X to analyze maternal response to an oral glucose tolerance test in targeted metabolomic profiles from the Hyperglycemia and Adverse Pregnancy Outcomes (HAPO) Study, identifying network-level differences in glucose metabolism between women with and without gestational diabetes mellitus.
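The abstract does not spell out how MSFA-X turns factor estimates into a network, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a factor-analytic covariance for one study of the form shared loadings plus study-specific loadings plus diagonal residual variance, and reads a GGM off the partial correlations of that covariance. The names `phi`, `lam`, `psi`, the simulated values, and the 0.1 edge threshold are all hypothetical choices made for illustration.

```python
import numpy as np

def partial_correlations(cov):
    """Convert a positive-definite covariance matrix into partial correlations.

    In a GGM, a zero partial correlation between two variables corresponds
    to the absence of an edge between them.
    """
    precision = np.linalg.inv(cov)
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Hypothetical loadings for one study (assumed, not taken from the paper).
rng = np.random.default_rng(0)
p, k_shared, k_specific = 6, 2, 1
phi = rng.normal(size=(p, k_shared))           # loadings shared across studies
lam = rng.normal(size=(p, k_specific))         # loadings specific to this study
psi = np.diag(rng.uniform(0.5, 1.0, size=p))   # residual variances

# Assumed covariance decomposition for this study.
sigma_s = phi @ phi.T + lam @ lam.T + psi

pcor = partial_correlations(sigma_s)

# Arbitrary illustrative threshold for declaring an edge.
edges = np.abs(pcor) > 0.1
np.fill_diagonal(edges, False)
```

Here `edges` is a boolean adjacency matrix for the study's network. In practice the loadings would be estimated from data rather than simulated, and edge selection would need a principled rule rather than a fixed cutoff.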