Multiple Testing in Genome-Wide Association Studies via Hierarchical Hidden Markov Models

12/20/2022
by Pengfei Wang, et al.

Large-scale multiple testing problems arise frequently in modern scientific research. Conventional multiple testing procedures often lose considerable efficiency because they ignore correlations among tests. Appropriate use of correlation information not only improves the efficiency of multiple testing but also makes the results easier to interpret. Because disease- or trait-related single nucleotide polymorphisms (SNPs) tend to cluster and exhibit serial correlation, multiple testing procedures based on the hidden Markov model (HMM) have been applied successfully in genome-wide association studies (GWAS). However, modeling an entire chromosome with a single HMM is too coarse. To overcome this issue, this paper employs the hierarchical hidden Markov model (HHMM) to describe local correlations among tests and develops a multiple testing procedure that not only automatically divides the chromosome into different classes of regions but also takes local correlations among tests into account. Theoretically, the proposed multiple testing procedure is shown to be valid and optimal in some sense. A data-driven procedure is then developed to mimic the oracle version. Extensive simulations and a real data analysis show that the proposed procedure outperforms its competitors.
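
To make the HMM-based idea concrete, the following is a minimal sketch of the simpler single-HMM testing procedure that the abstract contrasts with, not the HHMM procedure proposed in the paper. It assumes a two-state chain (0 = null, 1 = non-null), z-scores that are N(0, 1) under the null and Gaussian under the alternative, and known parameters (the "oracle" setting). The function names, the Gaussian alternative, and all parameter values are illustrative assumptions and do not come from the paper.

```python
import numpy as np


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def posterior_null_prob(z, trans, init, alt_mean, alt_sd):
    """Scaled forward-backward pass for a two-state HMM.

    Returns P(state_i = 0 | all z) for each test i. State 0 is the null.
    All parameters are assumed known here (an oracle-style assumption).
    """
    z = np.asarray(z, dtype=float)
    m = len(z)
    # Emission densities: column 0 = null N(0, 1), column 1 = assumed alternative.
    emis = np.column_stack([normal_pdf(z, 0.0, 1.0),
                            normal_pdf(z, alt_mean, alt_sd)])
    alpha = np.zeros((m, 2))
    beta = np.ones((m, 2))
    a = init * emis[0]
    alpha[0] = a / a.sum()
    for i in range(1, m):                 # forward recursion, rescaled per step
        a = (alpha[i - 1] @ trans) * emis[i]
        alpha[i] = a / a.sum()
    for i in range(m - 2, -1, -1):        # backward recursion, rescaled per step
        b = trans @ (emis[i + 1] * beta[i + 1])
        beta[i] = b / b.sum()
    post = alpha * beta
    post /= post.sum(axis=1, keepdims=True)
    return post[:, 0]


def step_up_reject(null_prob, level):
    """Reject the hypotheses with the smallest posterior null probabilities,
    taking the largest set whose average null probability is <= level."""
    null_prob = np.asarray(null_prob, dtype=float)
    order = np.argsort(null_prob)
    running_mean = np.cumsum(null_prob[order]) / np.arange(1, len(null_prob) + 1)
    passing = np.nonzero(running_mean <= level)[0]
    reject = np.zeros(len(null_prob), dtype=bool)
    if passing.size:
        reject[order[: passing[-1] + 1]] = True
    return reject


if __name__ == "__main__":
    # Hypothetical parameters and z-scores, chosen only for illustration.
    trans = np.array([[0.95, 0.05], [0.20, 0.80]])
    init = np.array([0.9, 0.1])
    z = np.array([0.0, 0.1, 4.0, 4.2, -0.2])
    null_prob = posterior_null_prob(z, trans, init, alt_mean=3.0, alt_sd=1.0)
    print(step_up_reject(null_prob, level=0.10))
```

Because each posterior probability depends on the whole sequence of z-scores, a moderately large statistic next to strong signals is ranked ahead of an isolated one, which is how serial correlation enters the ranking. A data-driven version would replace the assumed parameters with estimates.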
