Non-stationary Gaussian process discriminant analysis with variable selection for high-dimensional functional data

09/29/2021
by S Wade, et al.

High-dimensional classification and feature selection tasks have become ubiquitous with recent advances in data acquisition technology. In application areas such as biology, genomics and proteomics, the data are often functional in nature and exhibit a degree of roughness and non-stationarity. These structures pose additional challenges to commonly used methods, which rely mainly on a two-stage approach that performs variable selection and classification separately. In this work we propose a novel Gaussian process discriminant analysis (GPDA) that combines these steps in a unified framework. Our model is a two-layer non-stationary Gaussian process coupled with an Ising prior to identify differentially distributed locations. Scalable inference is achieved by developing a variational scheme that exploits advances in the use of sparse inverse covariance matrices. We demonstrate the performance of our methodology on simulated datasets and two proteomics datasets: breast cancer and SARS-CoV-2. Our approach distinguishes itself by offering explainability and uncertainty quantification in addition to low computational cost, which are crucial for increasing trust in and social acceptance of data-driven tools.
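To illustrate how an Ising prior can encourage selected locations to cluster along a functional domain, here is a minimal sketch in Python. It is not the paper's implementation: the chain neighbourhood structure, the function name `ising_log_prior`, and the parameters `a` (sparsity) and `b` (smoothing) are all assumptions made for illustration, and the abstract does not specify the parameterisation used.

```python
import numpy as np


def ising_log_prior(gamma, a=-1.0, b=0.5):
    """Unnormalised log-probability of binary selection flags on a 1-D chain.

    gamma : sequence of 0/1 flags, one per location (1 = selected)
    a     : hypothetical sparsity parameter; negative values favour
            fewer selected locations
    b     : hypothetical smoothing parameter; positive values favour
            neighbouring locations sharing the same state
    """
    # Map flags {0, 1} to spins {-1, +1}, the usual Ising convention.
    s = 2 * np.asarray(gamma) - 1
    # External-field term plus nearest-neighbour interaction term.
    return a * s.sum() + b * (s[:-1] * s[1:]).sum()


# Two configurations with the same number of selected locations.
clustered = [0, 0, 1, 1, 1, 0, 0, 0]
scattered = [1, 0, 0, 1, 0, 0, 1, 0]

print(ising_log_prior(clustered))
print(ising_log_prior(scattered))
```

With a positive `b`, the clustered configuration receives a higher unnormalised log-probability than the scattered one, even though both select three locations. This is the kind of spatial smoothing that makes an Ising prior a natural choice when differentially distributed locations are expected to be contiguous.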

research
09/08/2023

Generalized Variable Selection Algorithms for Gaussian Process Models by LASSO-like Penalty

With the rapid development of modern technology, massive amounts of data...
research
12/10/2018

Variational Nonparametric Discriminant Analysis

Variable selection and classification methods are common objectives in t...
research
07/18/2017

Latent Gaussian Process Regression

We introduce Latent Gaussian Process Regression which is a latent variab...
research
09/27/2017

Gaussian process modelling using UQLab

We introduce the Gaussian process modelling module of the UQLab software...
research
12/29/2022

Functional Integrative Bayesian Analysis of High-dimensional Multiplatform Genomic Data

Rapid advancements in collection and dissemination of multi-platform mol...
research
08/25/2020

Variable selection for Gaussian process regression through a sparse projection

This paper presents a new variable selection approach integrated with Ga...
research
07/11/2021

Rank-based Bayesian variable selection for genome-wide transcriptomic analyses

Variable selection is crucial in high-dimensional omics-based analyses, ...
