Clustering with minimum spanning trees: How good can it be?

03/10/2023
by Marek Gagolewski, et al.

Minimum spanning trees (MSTs) provide a convenient representation of datasets in numerous pattern recognition activities. Moreover, they are relatively fast to compute. In this paper, we quantify the extent to which they can be meaningful in data clustering tasks. By identifying the upper bounds for the agreement between the best (oracle) algorithm and the expert labels from a large battery of benchmark data, we discover that MST methods can overall be very competitive. Next, instead of proposing yet another algorithm that performs well on a limited set of examples, we review, study, extend, and generalise the existing state-of-the-art MST-based partitioning schemes, which leads to a few new and interesting approaches. It turns out that the Genie method and the information-theoretic approaches often outperform non-MST algorithms such as k-means, Gaussian mixtures, spectral clustering, BIRCH, and classical hierarchical agglomerative procedures.
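
To make the idea of MST-based partitioning concrete, here is a minimal sketch of the simplest classical scheme: build the Euclidean MST and cut its k-1 heaviest edges, which is equivalent to single linkage. It is not the Genie method or any of the algorithms studied in the paper; the function names `mst_edges` and `mst_cluster` and the toy data are assumptions made for illustration only.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns the n-1 tree edges as (weight, i, j) tuples.
    Hypothetical helper, written for illustration.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known distance from the tree to each point
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        # Update the cheapest connection for the remaining points.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges

def mst_cluster(points, k):
    """Cut the k-1 heaviest MST edges and label the connected components."""
    n = len(points)
    kept = sorted(mst_edges(points))[: n - k]  # keep the n-k lightest edges
    root = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i

    for _, i, j in kept:
        root[find(i)] = find(j)
    labels = {}
    # Number the components in order of first appearance.
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]

# Toy data (assumed): two tight groups and one isolated point.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0), (5.0, 5.0), (5.1, 5.2), (9.0, 0.0)]
labels = mst_cluster(points, 3)
print(labels)  # [0, 0, 0, 1, 1, 2]
```

Cutting the heaviest edges is sensitive to outliers, since a single distant point can consume one of the k clusters, as the isolated point does here. This sketch uses a quadratic-time Prim's algorithm for clarity, so it is only suitable for small inputs.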

research
03/15/2012

Bayesian Rose Trees

Hierarchical structure is ubiquitous in data across many domains. There ...
research
05/27/2015

New characterizations of minimum spanning trees and of saliency maps based on quasi-flat zones

We study three representations of hierarchies of partitions: dendrograms...
research
04/02/2021

Fast Parallel Algorithms for Euclidean Minimum Spanning Tree and Hierarchical Spatial Clustering

This paper presents new parallel algorithms for generating Euclidean min...
research
09/09/2019

A Classification Methodology based on Subspace Graphs Learning

In this paper, we propose a design methodology for one-class classifiers...
research
12/01/2022

Clustering – Basic concepts and methods

We review clustering as an analysis tool and the underlying concepts fro...
research
05/25/2023

Differentiable Clustering with Perturbed Spanning Forests

We introduce a differentiable clustering method based on minimum-weight ...
research
12/27/2019

Nonlinear Markov Clustering by Minimum Curvilinear Sparse Similarity

The development of algorithms for unsupervised pattern recognition by no...