Querying multiple sets of p-values through composed hypothesis testing

04/29/2021
by Tristan Mary-Huard, et al.

Motivation: Combining the results of different experiments to exhibit complex patterns or to improve statistical power is a typical aim of data integration. The starting point of the statistical analysis often comes as sets of p-values resulting from previous analyses, which need to be combined in a flexible way to explore complex hypotheses while guaranteeing a low proportion of false discoveries. Results: We introduce the generic concept of composed hypothesis, which corresponds to an arbitrarily complex combination of simple hypotheses. We rephrase the problem of testing a composed hypothesis as a classification task, and show that finding items for which the composed null hypothesis is rejected boils down to fitting a mixture model and classifying the items according to their posterior probabilities. We show that inference can be performed efficiently and provide a thorough classification rule to control the type I error. The performance and the usefulness of the approach are illustrated on simulations and on two different applications combining data of different types. The method is scalable, does not require any parameter tuning, and provides valuable biological insight in the application cases considered. Availability: The QCH methodology is implemented in an R package hosted on CRAN.
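To make the classification view concrete, here is a minimal Python sketch of the idea, not the QCH package's implementation. It assumes that the null and alternative densities of each series and the prior weights of each H0/H1 configuration have already been estimated; the function names, the composed hypothesis "at least min_h1 series are non-null", and the running-mean rejection rule are illustrative assumptions rather than details taken from the paper.

```python
from itertools import product
from math import prod


def composed_null_posterior(f0, f1, weights, min_h1):
    """Posterior probability of the composed null for a single item.

    f0, f1  : per-series null / alternative density values for this item
              (assumed already estimated; hypothetical inputs).
    weights : dict mapping each configuration in {0,1}^Q to its prior weight.
    min_h1  : the composed alternative is "at least min_h1 series are H1"
              (an illustrative choice of composed hypothesis).
    """
    n_series = len(f0)
    total = 0.0
    null_mass = 0.0
    for config in product((0, 1), repeat=n_series):
        # Mixture term for this configuration, assuming the series are
        # independent conditionally on the configuration.
        lik = weights[config] * prod(
            f1[q] if config[q] else f0[q] for q in range(n_series)
        )
        total += lik
        if sum(config) < min_h1:  # configuration belongs to the composed null
            null_mass += lik
    return null_mass / total


def reject_by_posterior(null_posteriors, alpha):
    """Reject the largest set of items whose mean null posterior is <= alpha.

    This is a generic posterior-averaging rule used here for illustration;
    it is not claimed to be the exact rule of the paper.
    """
    order = sorted(range(len(null_posteriors)), key=lambda i: null_posteriors[i])
    rejected = [False] * len(null_posteriors)
    running_sum = 0.0
    n_reject = 0
    for rank, i in enumerate(order, start=1):
        running_sum += null_posteriors[i]
        if running_sum / rank <= alpha:
            n_reject = rank
    for i in order[:n_reject]:
        rejected[i] = True
    return rejected
```

For example, with two series, uniform configuration weights, and an item whose alternative density is nine times its null density in both series, `composed_null_posterior([1, 1], [9, 9], weights, 2)` gives a composed-null posterior of about 0.19, so the item would only be rejected at a fairly permissive threshold.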


research
04/27/2021

Combining independent p-values in replicability analysis: A comparative study

Given a family of null hypotheses H_1,…,H_s, we are interested in the hy...
research
08/24/2022

Online multiple hypothesis testing for reproducible research

Modern data analysis frequently involves large-scale hypothesis testing,...
research
03/29/2021

Optimal False Discovery Rate Control for Large Scale Multiple Testing with Auxiliary Information

Large-scale multiple testing is a fundamental problem in high dimensiona...
research
12/15/2019

Randomized p-values for multiple testing and their application in replicability analysis

We are concerned with testing replicability hypotheses for many endpoint...
research
07/05/2023

Federated Epidemic Surveillance

The surveillance of a pandemic is a challenging task, especially when cr...
research
01/03/2019

Instance-Based Classification through Hypothesis Testing

Classification is a fundamental problem in machine learning and data min...
research
09/30/2019

Network Differential Connectivity Analysis

Identifying differences in networks has become a canonical problem in ma...
