Heavy-tailed distribution for combining dependent p-values with asymptotic robustness

by Yusi Fang, et al.

Combining individual p-values to aggregate multiple small effects is prevalent in many scientific investigations and is a long-standing statistical topic. Many classical methods are designed for combining independent and frequent signals in the traditional meta-analysis sense, using the sum of p-values transformed by a light-tailed distribution; Fisher's method and Stouffer's method are the best known. Since the early 2000s, advances in big data have promoted methods that aggregate independent, sparse, and weak signals, such as the renowned higher criticism and Berk-Jones tests. Recently, Liu and Xie (2020) and Wilson (2019) independently proposed the Cauchy and harmonic mean combination tests to robustly combine p-values under an "arbitrary" dependency structure; a notable application is combining p-values from a set of often-correlated SNPs in genome-wide association studies. These tests transform p-values by heavy-tailed distributions, which improves power when signals are sparse. This raises natural questions: how do heavy-tailed transformations behave in general, how are the existing methods connected, and under what conditions is a method robust to dependency? In this paper, we investigate the regularly varying distributions, a rich family of heavy-tailed distributions that includes the Pareto distribution as a special case. We show that only an equivalent class of Cauchy and harmonic mean tests has sufficient robustness to dependency in a practical sense. We also show an issue caused by the large negative penalty in the Cauchy method and propose a simple yet practical modification. Finally, we present simulations and apply the methods to a neuroticism GWAS to verify the theoretical insights and provide practical guidance.
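
To make the two heavy-tailed combinations named in the abstract concrete, here is a minimal Python sketch. It assumes the standard published forms: the Cauchy statistic is the weighted sum of tan((0.5 - p_i)π) referred to a standard Cauchy distribution, and the harmonic mean statistic is the weighted harmonic mean of the p-values. The function names, the equal default weights, and the example inputs are illustrative assumptions, not taken from the paper. The harmonic mean function returns the raw statistic only; its calibration into a p-value is not reproduced here, and neither is the modification the paper proposes.

```python
import math


def cauchy_combination(pvalues, weights=None):
    """Combined p-value from the Cauchy combination statistic.

    Weights are assumed non-negative and summing to 1 (equal by default).
    """
    n = len(pvalues)
    if weights is None:
        weights = [1.0 / n] * n
    # Each p-value is mapped to a standard Cauchy quantile; small p-values
    # give large positive terms, p-values near 1 give large negative terms.
    t = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvalues))
    # Upper tail probability of a standard Cauchy variable at t.
    return 0.5 - math.atan(t) / math.pi


def harmonic_mean_statistic(pvalues, weights=None):
    """Weighted harmonic mean of the p-values (a statistic, not a calibrated p-value)."""
    n = len(pvalues)
    if weights is None:
        weights = [1.0 / n] * n
    return 1.0 / sum(w / p for w, p in zip(weights, pvalues))


if __name__ == "__main__":
    sparse_signal = [1e-6, 0.4, 0.5, 0.6]      # hypothetical inputs
    with_penalty = [1e-6, 0.5, 1.0 - 1e-6]     # one p-value very close to 1
    print(cauchy_combination(sparse_signal))
    print(harmonic_mean_statistic(sparse_signal))
    print(cauchy_combination(with_penalty))
```

The last call illustrates the large negative penalty mentioned in the abstract: a p-value very close to 1 contributes a term as large in magnitude as the one from the very small p-value, but with the opposite sign, so the two roughly cancel and the combined Cauchy p-value ends up near 0.5.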




The Lévy combination test

A novel class of methods for combining p-values to perform aggregate hyp...

Cauchy combination test: a powerful test with analytic p-value calculation under arbitrary dependency structures

Combining individual p-values to aggregate multiple small effects has a ...

Spliced Binned-Pareto Distribution for Robust Modeling of Heavy-tailed Time Series

This work proposes a novel method to robustly and accurately model time ...

Cauchy Combination Test for Sparse Signals

Aggregating multiple effects is often encountered in large-scale data an...

On p-value combination of independent and frequent signals: asymptotic efficiency and Fisher ensemble

Combining p-values to integrate multiple effects is of long-standing int...

Pareto GAN: Extending the Representational Power of GANs to Heavy-Tailed Distributions

Generative adversarial networks (GANs) are often billed as "universal di...

Optimal Randomness in Swarm-based Search

Swarm-based search has been a hot topic for a long time. Among all the p...
