Comparison of Clustering Algorithms for Statistical Features of Vibration Data Sets

05/11/2023
by Philipp Sepin, et al.

Vibration-based condition monitoring systems are receiving increasing attention because they can accurately identify different conditions by capturing dynamic features over a broad frequency range. However, there is little research on clustering approaches for vibration data, and the resulting solutions are often optimized for a single data set. In this work, we present an extensive comparison of three clustering algorithms, K-means, OPTICS, and Gaussian mixture model (GMM) clustering, applied to statistical features extracted from the time and frequency domains of vibration data sets. Furthermore, we investigate how feature combinations, feature selection using principal component analysis (PCA), and the specified number of clusters influence the performance of the clustering algorithms. We conducted this comparison as a grid search on three different benchmark data sets. Our work showed that averaging features (Mean, Median) and variance-based features (Standard Deviation, Interquartile Range) performed significantly better than shape-based features (Skewness, Kurtosis). In addition, K-means slightly outperformed GMM on these data sets, whereas OPTICS performed significantly worse. We were also able to show that neither feature combinations nor PCA feature selection resulted in any significant performance improvement. The clustering algorithms performed better as the specified number of clusters increased, although some algorithm-specific restrictions applied.
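To make the six statistical features named in the abstract concrete, the following minimal sketch computes them for a single time-domain signal using only the Python standard library. It is an illustration, not the authors' implementation: the function name `statistical_features`, the use of the population standard deviation, the inclusive quartile method, the non-excess definition of kurtosis, and the synthetic sine test signal are all assumptions made here. The abstract also mentions frequency-domain features; one possible approach (again an assumption) would be to apply the same function to a magnitude spectrum.

```python
import math
import statistics


def statistical_features(signal):
    """Compute mean, median, standard deviation, IQR, skewness, and kurtosis.

    Assumptions (not taken from the paper): population standard deviation,
    inclusive quartiles, and non-excess kurtosis (a normal distribution
    would give roughly 3).
    """
    n = len(signal)
    mean = statistics.fmean(signal)
    median = statistics.median(signal)
    std = statistics.pstdev(signal)

    # Quartiles: quantiles(n=4) returns the three cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(signal, n=4, method="inclusive")
    iqr = q3 - q1

    # Third and fourth central moments, standardized by the std.
    m3 = sum((x - mean) ** 3 for x in signal) / n
    m4 = sum((x - mean) ** 4 for x in signal) / n
    skewness = m3 / std ** 3 if std > 0 else 0.0
    kurtosis = m4 / std ** 4 if std > 0 else 0.0

    return {
        "mean": mean,
        "median": median,
        "std": std,
        "iqr": iqr,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Hypothetical test signal: five full periods of a unit-amplitude sine.
signal = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(1000)]
features = statistical_features(signal)
```

In a clustering setting, each recording (or each window of a recording) would yield one such feature vector, and the resulting vectors would form the rows of the matrix passed to the clustering algorithm.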
