Applying Semi-Automated Hyperparameter Tuning for Clustering Algorithms

by Elizabeth Ditton, et al.
James Cook University

When approaching a clustering problem, choosing the right clustering algorithm and parameters is essential, as each clustering algorithm is proficient at finding clusters of a particular nature. Because clustering is unsupervised, no ground truth values are available for empirical evaluation, which makes it difficult to automate parameter selection through hyperparameter tuning. Previous approaches to hyperparameter tuning for clustering algorithms have relied either on internal metrics, which are often biased towards certain algorithms, or on the availability of some ground truth labels, which moves the problem into the semi-supervised space. This preliminary study proposes a framework for semi-automated hyperparameter tuning of clustering algorithms, using a grid search to produce a series of graphs and easy-to-interpret metrics that can then be used for more efficient domain-specific evaluation. Preliminary results show that internal metrics are unable to capture the semantic quality of the resulting clusters, and that approaches driven by internal metrics would reach different conclusions from those driven by manual evaluation.
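The general idea of running a grid search and handing the results to a person, rather than picking a winner automatically, might be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which algorithms, metrics, or parameter grids the study uses, so the plain k-means, the silhouette coefficient, the toy data, and the names `kmeans`, `silhouette`, and `grid_report` are all hypothetical choices made here.

```python
import itertools
import math
import random


def kmeans(points, k, seed, iters=50):
    """Plain Lloyd's algorithm (an assumed stand-in); returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient, used here as one example internal metric."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return float("nan")  # undefined for a single cluster
    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        scores.append((b - a) / (max(a, b) or 1.0))
    return sum(scores) / len(scores)


def grid_report(points, grid):
    """Run every parameter combination and collect one summary row per run."""
    names = list(grid)
    rows = []
    for combo in itertools.product(*(grid[n] for n in names)):
        params = dict(zip(names, combo))
        labels = kmeans(points, **params)
        sizes = sorted((labels.count(l) for l in set(labels)), reverse=True)
        rows.append({**params,
                     "silhouette": round(silhouette(points, labels), 3),
                     "sizes": sizes})
    return rows


# Toy data (an assumption): two small, well-separated groups of points.
POINTS = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
GRID = {"k": [2, 3, 4], "seed": [0, 1]}

rows = grid_report(POINTS, GRID)
for row in rows:
    print(row)  # a table for a domain expert to review, not an automatic choice
```

The report deliberately stops short of selecting a configuration. Each row pairs an internal metric with a simple, readable summary (cluster sizes), so that a reviewer can weigh the metric against domain knowledge.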




