Large-scale Collaborative Imaging Genetics Studies of Risk Genetic Factors for Alzheimer's Disease Across Multiple Institutions

by Qingyang Li, et al.

Genome-wide association studies (GWAS) offer new opportunities to identify genetic risk factors for Alzheimer's disease (AD). Recently, collaborative efforts across institutions have emerged that increase the power of many existing techniques beyond what a single institution's data allows. However, a major barrier to collaborative GWAS is that many institutions need to preserve the privacy of individual data. To address this challenge, we propose a novel distributed framework, termed the Local Query Model (LQM), to detect risk single-nucleotide polymorphisms (SNPs) for AD across multiple research institutions. To accelerate the learning process, we propose a Distributed Enhanced Dual Polytope Projection (D-EDPP) screening rule that identifies irrelevant features and removes them from the optimization. To the best of our knowledge, this is the first successful run of the computationally intensive model selection procedure that learns a consistent model across different institutions without compromising their privacy, while ranking the SNPs that may collectively affect AD. Empirical studies are conducted on 809 subjects with 5.9 million SNP features, distributed across three institutions. D-EDPP achieved a 66-fold speed-up by effectively identifying irrelevant features.
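
As a rough illustration of the idea of querying institutions for local summaries rather than raw data, the sketch below computes the standard Lasso quantity lambda_max = max_j |(X^T y)_j| (for the objective 1/2 ||y - Xb||^2 + lambda ||b||_1) from per-institution vectors X_i^T y_i. Screening rules of the dual polytope projection family typically start from this value. This is not the authors' implementation: the function names `local_correlation` and `global_lambda_max` and the toy data are hypothetical, and the abstract does not specify which quantities LQM actually exchanges.

```python
from typing import List

Matrix = List[List[float]]  # rows = subjects, columns = SNP features


def local_correlation(X: Matrix, y: List[float]) -> List[float]:
    """Run inside one institution: return the p-vector X^T y.

    Only this aggregate leaves the site; the subject-level rows stay local.
    (Hypothetical helper, assumed for illustration.)
    """
    p = len(X[0])
    out = [0.0] * p
    for row, target in zip(X, y):
        for j in range(p):
            out[j] += row[j] * target
    return out


def global_lambda_max(local_vectors: List[List[float]]) -> float:
    """Sum the institutions' vectors, then take the largest absolute entry."""
    total = [sum(vals) for vals in zip(*local_vectors)]
    return max(abs(v) for v in total)


# Toy data: two sites, two features (made-up numbers, not from the study).
site_a = ([[1.0, 0.0], [0.0, 2.0]], [1.0, -1.0])
site_b = ([[2.0, 1.0]], [0.5])

shared = [local_correlation(X, y) for X, y in (site_a, site_b)]
lam_max = global_lambda_max(shared)
print(lam_max)
```

Because X^T y over the stacked data equals the sum of the per-site products, the aggregated value matches what a single pooled computation would give, without any site revealing individual rows.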




Large-scale Feature Selection of Risk Genetic Factors for Alzheimer's Disease via Distributed Group Lasso Regression

Genome-wide association studies (GWAS) have achieved great success in th...

Handling Data Heterogeneity with Generative Replay in Collaborative Learning for Medical Imaging

Collaborative learning, which enables collaborative and decentralized tr...

Federated Generalized Linear Mixed Models for Collaborative Genome-wide Association Studies

As the sequencing costs are decreasing, there is great incentive to perf...

Alcohol Intake Differentiates AD and LATE: A Telltale Lifestyle from Two Large-Scale Datasets

Alzheimer's disease (AD), as a progressive brain disease, affects cognit...

Privacy-Preserving Distributed Deep Learning for Clinical Data

Deep learning with medical data often requires larger samples sizes than...

A Dataset and Baseline Approach for Identifying Usage States from Non-Intrusive Power Sensing With MiDAS IoT-based Sensors

The state identification problem seeks to identify power usage patterns ...

Tackling the dimensions in imaging genetics with CLUB-PLS

A major challenge in imaging genetics and similar fields is to link high...
