Multiple two-sample testing under arbitrary covariance dependency with an application in imaging mass spectrometry

by Vladimir Vutov, et al.

Large-scale hypothesis testing has become a ubiquitous problem in high-dimensional statistical inference, with broad applications in various scientific disciplines. One relevant application is imaging mass spectrometry (IMS) association studies, where a large number of tests are performed simultaneously in order to identify molecular masses that are associated with a particular phenotype, e.g., a cancer subtype. Mass spectra obtained from matrix-assisted laser desorption/ionization (MALDI) experiments are dependent when considered as statistical quantities. False discovery proportion (FDP) control under an arbitrary dependency structure among test statistics is an active topic in modern multiple testing research. In this context, we are concerned with the evaluation of associations between a binary outcome variable (describing the phenotype) and multiple predictors derived from MALDI measurements. We propose an inference procedure that utilizes the correlation matrix of the test statistics. The approach is based on multiple marginal models (MMM). Specifically, we fit a marginal logistic regression model for each predictor individually. Asymptotic joint normality of the stacked vector of the marginal regression coefficients is established under standard regularity assumptions, and their (limiting) correlation matrix is estimated. The proposed method extracts common factors from the resulting empirical correlation matrix. Finally, we estimate the realized FDP of a thresholding procedure for the marginal p-values. We demonstrate a practical application of the proposed workflow to MALDI IMS data in an oncological context.
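The workflow described in the abstract (marginal logistic fits, a correlation matrix for the stacked coefficients, common factors, and an FDP estimate for a p-value threshold) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the correlation matrix is obtained by stacking per-observation influence contributions of the marginal slopes, that the factors are the leading eigenvectors of that matrix, and that the FDP is approximated with a principal-factor-style formula; the abstract does not spell out any of these details. The factor realizations are fitted by ordinary least squares on the 90% smallest |z|, which is a simplification chosen here. All function names, the number of factors `k`, the threshold `t`, and the simulated data are hypothetical.

```python
from statistics import NormalDist
import numpy as np

_norm = NormalDist()
_Phi = np.vectorize(_norm.cdf, otypes=[float])  # standard normal CDF, elementwise

def fit_marginal_logit(x, y, n_iter=25):
    """Newton-Raphson fit of logit P(y=1) = b0 + b1*x for a single predictor.
    Returns the slope estimate and its per-observation influence contributions."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        H = X.T @ (X * (p * (1.0 - p))[:, None])   # Fisher information
        beta = beta + np.linalg.solve(H, X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    H = X.T @ (X * (p * (1.0 - p))[:, None])
    scores = X * (y - p)[:, None]                  # per-observation score vectors
    influence = scores @ np.linalg.inv(H)          # H is symmetric
    return beta[1], influence[:, 1]

def marginal_z_and_corr(X, y):
    """Fit one marginal model per column; stack influence terms to estimate
    the covariance (and correlation) of the slope estimates across models."""
    fits = [fit_marginal_logit(X[:, j], y) for j in range(X.shape[1])]
    slopes = np.array([f[0] for f in fits])
    Psi = np.column_stack([f[1] for f in fits])    # n x m influence matrix
    cov = Psi.T @ Psi                              # sandwich-type covariance
    sd = np.sqrt(np.diag(cov))
    z = slopes / sd
    corr = cov / np.outer(sd, sd)
    pvals = 2.0 * (1.0 - _Phi(np.abs(z)))          # two-sided marginal p-values
    return z, corr, pvals

def estimate_fdp(z, corr, pvals, t, k=1):
    """Approximate the realized FDP of rejecting all p-values <= t,
    using k common factors taken from the correlation matrix (assumption)."""
    lam, vec = np.linalg.eigh(corr)
    top = np.argsort(lam)[::-1][:k]
    L = vec[:, top] * np.sqrt(np.clip(lam[top], 0.0, None))  # m x k loadings
    keep = np.argsort(np.abs(z))[: int(0.9 * len(z))]        # mostly null statistics
    w, *_ = np.linalg.lstsq(L[keep], z[keep], rcond=None)    # factor realizations
    b = L @ w
    a = 1.0 / np.sqrt(np.clip(1.0 - np.sum(L**2, axis=1), 1e-8, None))
    zt = _norm.inv_cdf(t / 2.0)
    expected_false = np.sum(_Phi(a * (zt + b)) + _Phi(a * (zt - b)))
    rejections = max(int(np.sum(pvals <= t)), 1)
    return min(expected_false / rejections, 1.0)

# Simulated illustration only: 30 predictors sharing one latent factor,
# of which only the first is truly associated with the binary outcome.
rng = np.random.default_rng(0)
n, m = 200, 30
factor = rng.normal(size=n)
X = 0.6 * factor[:, None] + 0.8 * rng.normal(size=(n, m))
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X[:, 0]))).astype(float)

z, corr, pvals = marginal_z_and_corr(X, y)
fdp_hat = estimate_fdp(z, corr, pvals, t=0.05, k=1)
print(f"rejections at t=0.05: {int(np.sum(pvals <= 0.05))}, estimated FDP: {fdp_hat:.3f}")
```

In this sketch the dependence among the marginal test statistics enters only through `corr`, so a different estimator of the limiting correlation matrix could be swapped in without changing `estimate_fdp`.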




Multiple multi-sample testing under arbitrary covariance dependency

Modern high-throughput biomedical devices routinely produce data on a la...

Inferring on joint associations from marginal associations and a reference sample

We present a method to infer on joint regression coefficients obtained f...

Global and Simultaneous Hypothesis Testing for High-Dimensional Logistic Regression Models

High-dimensional logistic regression is widely used in analyzing data wi...

A Decorrelating and Debiasing Approach to Simultaneous Inference for High-Dimensional Confounded Models

Motivated by the simultaneous association analysis with the presence of ...

Asymptotic Uncertainty of False Discovery Proportion for Dependent t-Tests

Multiple testing is a fundamental problem in high-dimensional statistica...

Large-Scale Multiple Testing for Matrix-Valued Data under Double Dependency

High-dimensional inference based on matrix-valued data has drawn increas...

Asymptotic Uncertainty of False Discovery Proportion

Multiple testing has been a popular topic in statistical research. Altho...
