Quantile-based clustering

06/27/2018
by Christian Hennig, et al.

A new cluster analysis method, K-quantiles clustering, is introduced. K-quantiles clustering can be computed by a simple greedy algorithm in the style of the classical Lloyd's algorithm for K-means, and it can be applied to large and high-dimensional datasets. It allows for within-cluster skewness and for internal variable scaling based on within-cluster variation. Different versions of the method offer different levels of parsimony and computational efficiency. Although K-quantiles clustering is conceived as nonparametric, it can be connected to a fixed partition model of generalized asymmetric Laplace distributions. The consistency of K-quantiles clustering is proved, and it is shown that K-quantiles clusters correspond to well-separated mixture components in a nonparametric mixture. In a simulation study, K-quantiles clustering is compared with several popular clustering methods and performs well. Finally, a high-dimensional microarray dataset is clustered with K-quantiles.
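To illustrate what a Lloyd-style greedy scheme built on quantiles might look like, here is a minimal sketch. It is not the authors' implementation. It assumes a single quantile level `theta` shared by all variables, no variable scaling, and the standard quantile (check) loss as the discrepancy between a point and a cluster center. The names `check_loss` and `k_quantiles` are hypothetical. The sketch alternates between assigning each point to the center with the smallest summed check loss and resetting each center to the coordinate-wise empirical `theta`-quantile of its cluster, which minimizes that loss.

```python
import numpy as np


def check_loss(u, theta):
    """Quantile (check) loss: theta * u for u >= 0, (theta - 1) * u for u < 0."""
    return np.where(u >= 0, theta * u, (theta - 1.0) * u)


def k_quantiles(X, k, theta=0.5, n_iter=100, seed=0):
    """Lloyd-style sketch (assumption: common theta, no variable scaling)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    # Initialise the centers with k distinct observations chosen at random.
    centers = X[rng.choice(n, size=k, replace=False)].copy()
    labels = np.full(n, -1)
    for _ in range(n_iter):
        # Assignment step: summed check loss of every point to every center.
        diffs = X[:, None, :] - centers[None, :, :]      # shape (n, k, p)
        cost = check_loss(diffs, theta).sum(axis=2)      # shape (n, k)
        new_labels = cost.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # the partition no longer changes
        labels = new_labels
        # Update step: coordinate-wise empirical theta-quantile per cluster.
        for c in range(k):
            members = X[labels == c]
            m = len(members)
            if m == 0:
                continue  # keep the old center if a cluster becomes empty
            idx = max(int(np.ceil(theta * m)) - 1, 0)
            # The order statistic at idx minimises the check loss.
            centers[c] = np.sort(members, axis=0)[idx]
    return labels, centers
```

With `theta=0.5` the check loss is half the absolute deviation, so this sketch reduces to a K-medians-type procedure. Other values of `theta` place each center off the middle of its cluster, which is one way a quantile-based method can accommodate within-cluster skewness.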
