Bayesian hierarchical models for SNP discovery from genome-wide association studies, a semi-supervised machine learning approach

by Yan Xu, et al.

Genome-wide association studies (GWASs) aim to detect genetic risk factors for complex human diseases by identifying disease-associated single-nucleotide polymorphisms (SNPs). The SNP-wise approach, the standard method for analyzing GWAS data, tests each SNP individually; the resulting p-values are then adjusted for multiple testing. Because this adjustment is based purely on p-values and does not adequately model the relationships among SNPs, it is overly conservative and leaves many GWASs underpowered. To address this problem, we propose a novel method that borrows information across SNPs by grouping them into three clusters. We pre-specify the cluster patterns in terms of the minor allele frequencies of SNPs in cases versus controls, and enforce these patterns with prior distributions. As a result, compared with the traditional approach, the proposed method better controls the false discovery rate (FDR) and shows higher sensitivity, which is confirmed by our simulation studies. We also re-analyzed real study data on SNPs associated with severe bortezomib-induced peripheral neuropathy (BiPN) in patients with multiple myeloma. The original analysis in the literature failed to identify any SNPs after FDR adjustment. Our proposed method not only detected the reported SNPs after FDR adjustment but also discovered a novel SNP, rs4351714, which has been reported to be related to multiple myeloma in another study.
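To make the baseline concrete, the following is a minimal sketch of a p-value-only FDR adjustment of the kind the SNP-wise approach relies on. The abstract does not say which adjustment procedure was used, so the Benjamini-Hochberg step-up procedure here is an assumption, and the function name `bh_adjust` and the example p-values are purely illustrative. It is not the proposed hierarchical model.

```python
def bh_adjust(pvals):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Assumption: BH is used as a representative p-value-only FDR adjustment;
    the source does not name the procedure it compares against.
    """
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-SNP p-values (illustrative only, not from the study).
snp_pvals = [0.01, 0.04, 0.03, 0.005]
snp_qvals = bh_adjust(snp_pvals)
# SNPs whose adjusted p-value falls below the chosen FDR level are reported.
discoveries = [i for i, q in enumerate(snp_qvals) if q < 0.05]
```

The adjustment sees only the p-values, so each SNP is treated in isolation apart from its rank; nothing about allele-frequency patterns shared across SNPs enters the calculation, which is the limitation the abstract points to.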



Increasing the Discovery Power and Confidence Levels of Disease Association Studies: A Survey

The majority of common diseases are influenced by multiple genetic and e...

Clustering MIC data through Bayesian mixture models: an application to detect M. Tuberculosis resistance mutations

Antimicrobial resistance is becoming a major threat to public health thr...

VIMCO: Variational Inference for Multiple Correlated Outcomes in Genome-wide Association Studies

In Genome-Wide Association Studies (GWAS) where multiple correlated trai...

Phenotyping with Positive Unlabelled Learning for Genome-Wide Association Studies

Identifying phenotypes plays an important role in furthering our underst...

Multiple Testing in Genome-Wide Association Studies via Hierarchical Hidden Markov Models

The problems of large-scale multiple testing are often encountered in mo...

A robust statistical method for Genome-wide association analysis of human copy number variation

Conducting genome-wide association studies (GWAS) in copy number variati...

Wavelet Screaming: a novel approach to analyzing GWAS data

We present an alternative method for genome-wide association studies (GW...
