Tackling the dimensions in imaging genetics with CLUB-PLS

by Andre Altmann, et al.

A major challenge in imaging genetics and similar fields is to link high-dimensional data in one domain, e.g., genetic data, to high-dimensional data in a second domain, e.g., brain imaging data. The standard approach in the area is mass-univariate analysis across genetic factors and imaging phenotypes, which entails running one genome-wide association study (GWAS) for each pre-defined imaging measure. Although this approach has been tremendously successful, one shortcoming is that phenotypes must be pre-defined. Consequently, effects that are not confined to pre-selected regions of interest, or that reflect larger brain-wide patterns, can easily be missed. In this work we introduce a Partial Least Squares (PLS)-based framework, which we term Cluster-Bootstrap PLS (CLUB-PLS), that can work with large input dimensions in both domains as well as with large sample sizes. One key element of the framework is the use of a cluster bootstrap to provide robust statistics for single input features in both domains. We applied CLUB-PLS to investigate the genetic basis of surface area and cortical thickness in a sample of 33,000 subjects from the UK Biobank. We found 107 genome-wide significant locus-phenotype pairs that are linked to 386 different genes. The vast majority of these loci could be technically validated: using classic GWAS or Genome-Wide Inferred Statistics (GWIS), 85 locus-phenotype pairs exceeded the genome-wide suggestive threshold (P<1e-05).
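
To make the idea of bootstrap-based statistics for single features more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the leading PLS weight pair is taken from the top singular vectors of the cross-covariance matrix, that resampling is done over a hypothetical per-subject cluster label (for example, a family identifier), and that a feature's stability is summarized as its bootstrap mean divided by its bootstrap standard deviation. The function names, the `clusters` argument, and the simulated data are all illustrative assumptions.

```python
import numpy as np


def first_pls_weights(X, Y):
    """Leading PLS weight pair: top singular vectors of the cross-covariance X^T Y."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    U, _, Vt = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    return U[:, 0], Vt[0]


def cluster_bootstrap_pls(X, Y, clusters, n_boot=200, seed=0):
    """Bootstrap ratios (mean / std) for each feature weight in both domains.

    Whole clusters of subjects are resampled with replacement (assumed scheme).
    """
    rng = np.random.default_rng(seed)
    u_ref, _ = first_pls_weights(X, Y)
    labels = np.unique(clusters)
    members = {c: np.flatnonzero(clusters == c) for c in labels}
    us = np.empty((n_boot, X.shape[1]))
    vs = np.empty((n_boot, Y.shape[1]))
    for b in range(n_boot):
        drawn = rng.choice(labels, size=labels.size, replace=True)
        idx = np.concatenate([members[c] for c in drawn])
        u, v = first_pls_weights(X[idx], Y[idx])
        # Singular vectors are defined only up to a joint sign flip,
        # so align each replicate with the full-sample solution.
        if u @ u_ref < 0:
            u, v = -u, -v
        us[b], vs[b] = u, v
    z_u = us.mean(axis=0) / us.std(axis=0, ddof=1)
    z_v = vs.mean(axis=0) / vs.std(axis=0, ddof=1)
    return z_u, z_v


# Simulated toy data: feature 0 of X drives feature 0 of Y.
rng = np.random.default_rng(1)
n_clusters, cluster_size = 100, 3
clusters = np.repeat(np.arange(n_clusters), cluster_size)
X = rng.standard_normal((n_clusters * cluster_size, 20))  # stand-in for genotypes
Y = rng.standard_normal((n_clusters * cluster_size, 10))  # stand-in for imaging
Y[:, 0] += 2.0 * X[:, 0]

z_u, z_v = cluster_bootstrap_pls(X, Y, clusters)
print(np.argmax(np.abs(z_u)), np.argmax(np.abs(z_v)))
```

In this toy setting the planted feature pair should stand out with the largest absolute bootstrap ratio in each domain. Real genetic and imaging inputs are far larger, so a practical version would need a memory-efficient way to form or approximate the cross-covariance matrix.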




High-dimensional statistical inference for linkage disequilibrium score regression and its cross-ancestry extensions

Linkage disequilibrium score regression (LDSC) has emerged as an essenti...

A statistical framework for GWAS of high dimensional phenotypes using summary statistics, with application to metabolite GWAS

The recent explosion of genetic and high dimensional biobank and 'omic' ...

ACE of Space: Estimating Genetic Components of High-Dimensional Imaging Data

It is of great interest to quantify the contributions of genetic variati...

Genetic underpinnings of brain structural connectome for young adults

With distinct advantages in power over behavioral phenotypes, brain imag...

Cross-trait prediction accuracy of high-dimensional ridge-type estimators in genome-wide association studies

Marginal association summary statistics have attracted great attention i...

A simple genome-wide association study algorithm

A computationally simple genome-wide association study (GWAS) algorithm ...

Large-scale Collaborative Imaging Genetics Studies of Risk Genetic Factors for Alzheimer's Disease Across Multiple Institutions

Genome-wide association studies (GWAS) offer new opportunities to identi...
