Ensemble Method for Cluster Number Determination and Algorithm Selection in Unsupervised Learning

12/23/2021
by Antoine Zambelli, et al.

Unsupervised learning, and clustering in particular, requires expertise in the field to be of use. Researchers must make careful, informed decisions about which algorithm to use, and with which hyperparameters, for a given dataset. They may also need to determine the number of clusters in the dataset, which is itself a required input to most clustering algorithms. All of this must happen before they can begin their actual subject-matter work. After quantifying the impact of algorithm and hyperparameter selection, we propose an ensemble clustering framework that can be used with minimal input. It can determine both the number of clusters in a dataset and a suitable choice of algorithm for that dataset. A code library is included in the Conclusion for ease of integration.
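To make the idea of letting an ensemble vote on the number of clusters concrete, here is a minimal sketch. It is not the framework or the library from the paper: it assumes a toy ensemble of k-means runs with different seeds, each of which picks the candidate k with the highest mean silhouette score, followed by a majority vote. The names `kmeans`, `silhouette`, and `estimate_k`, the farthest-first initialisation, and the toy data are all illustrative assumptions, and algorithm selection is left out entirely.

```python
import math
import random
from collections import Counter


def kmeans(points, k, seed, iters=50):
    """Plain k-means with farthest-first initialisation; returns one label per point."""
    rng = random.Random(seed)
    centers = [points[rng.randrange(len(points))]]
    while len(centers) < k:
        # Next centre: the point farthest from every centre chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties out
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0  # undefined for a single cluster; treat as worst case
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def estimate_k(points, candidate_ks=range(2, 6), seeds=range(5)):
    """Each seeded run votes for its best-scoring k; the most common vote wins."""
    votes = Counter()
    for seed in seeds:
        best_k = max(candidate_ks, key=lambda k: silhouette(points, kmeans(points, k, seed)))
        votes[best_k] += 1
    return votes.most_common(1)[0][0], dict(votes)


# Toy data (an assumption for illustration): three tight, well-separated blobs.
offsets = [(-0.5, -0.5), (0.5, -0.5), (-0.5, 0.5), (0.5, 0.5), (0.0, 0.0)]
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + dx, cy + dy) for cx, cy in blob_centers for dx, dy in offsets]

k, votes = estimate_k(points)
print(k, votes)
```

On this toy data the blobs are far enough apart that the vote should settle on three clusters. A fuller ensemble would presumably mix different algorithms and hyperparameter settings rather than only reseeding one algorithm, which is where the choice of algorithm would enter.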

research
03/29/2018

On Hyperparameter Search in Cluster Ensembles

Quality assessments of models in unsupervised learning and clustering ve...
research
10/14/2019

DISCERN: Diversity-based Selection of Centroids for k-Estimation and Rapid Non-stochastic Clustering

As one of the most ubiquitously applied unsupervised learning methods, c...
research
07/08/2021

The Three Ensemble Clustering (3EC) Algorithm for Pattern Discovery in Unsupervised Learning

This paper presents a multiple learner algorithm called the 'Three Ensem...
research
01/23/2017

The Impact of Random Models on Clustering Similarity

Clustering is a central approach for unsupervised learning. After cluste...
research
05/22/2018

Clustering - What Both Theoreticians and Practitioners are Doing Wrong

Unsupervised learning is widely recognized as one of the most important ...
research
06/29/2018

Grapevine: A Wine Prediction Algorithm Using Multi-dimensional Clustering Methods

We present a method for a wine recommendation system that employs multid...
research
09/15/2023

Choice of trimming proportion and number of clusters in robust clustering based on trimming

So-called "classification trimmed likelihood curves" have been proposed ...
