Finding Archetypal Spaces for Data Using Neural Networks

01/25/2019
by David van Dijk, et al.

Archetypal analysis is a type of factor analysis in which data are fit by a convex polytope whose corners are "archetypes" of the data, and each data point is represented as a convex combination of these archetypal points. While archetypal analysis has been used on biological data, it has not achieved widespread adoption because most data are not well fit by a convex polytope, either in the ambient space or after standard data transformations. We propose a new approach to archetypal analysis. Instead of fitting a convex polytope directly on the data or after a specific data transformation, we train a neural network (AAnet) to learn a transformation under which the data best fit into a polytope. We validate this approach on synthetic data to which we add nonlinearity. Here, AAnet is the only method that correctly identifies the archetypes. We also demonstrate AAnet on two biological datasets. In a T cell dataset measured with single-cell RNA sequencing, AAnet identifies several archetypal states corresponding to naive, memory, and cytotoxic T cells. In a dataset of gut microbiome profiles, AAnet both recovers previously described microbiome states and identifies novel extrema in the data. Finally, we show that AAnet has generative properties that allow us to sample uniformly from the data geometry even when the input data are not uniformly distributed.
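
The convex-combination representation described above can be illustrated with a minimal NumPy sketch. This is not AAnet or the authors' code: the triangle of archetypes, the function names `convex_combine` and `sample_simplex`, and the use of a flat Dirichlet distribution to draw weights uniformly from the simplex are all assumptions made for illustration, and no neural network or learned transformation is involved.

```python
import numpy as np


def convex_combine(weights, archetypes):
    """Map simplex weights (n, k) and archetypes (k, d) to points (n, d).

    Each output point is a convex combination of the archetypes, so it
    lies inside the polytope whose corners are the archetypes.
    """
    weights = np.asarray(weights, dtype=float)
    archetypes = np.asarray(archetypes, dtype=float)
    if np.any(weights < 0) or not np.allclose(weights.sum(axis=1), 1.0):
        raise ValueError("each row of weights must be non-negative and sum to 1")
    return weights @ archetypes


def sample_simplex(n, k, seed=0):
    """Draw n weight vectors uniformly from the (k-1)-simplex.

    A Dirichlet distribution with all concentration parameters equal to 1
    is uniform over the simplex.
    """
    rng = np.random.default_rng(seed)
    return rng.dirichlet(np.ones(k), size=n)


# Hypothetical archetypes: the corners of a triangle in 2D.
archetypes = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

# Uniformly sampled weights give points spread over the whole triangle.
weights = sample_simplex(500, 3)
points = convex_combine(weights, archetypes)

# A one-hot weight vector reproduces the corresponding archetype exactly.
corner = convex_combine([[0.0, 1.0, 0.0]], archetypes)
```

In this toy setting the polytope lives directly in the data space. The abstract's proposal differs in that the polytope is fit in a space reached through a learned transformation, which this sketch does not attempt to model.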


