Estimating the Number of Clusters via Normalized Cluster Instability
We improve existing instability-based methods for selecting the number of clusters k in cluster analysis by normalizing instability. In contrast to existing instability methods, which perform well only for bounded sequences of small k, our method performs well across the whole sequence of possible k. In addition, we compare, for the first time, model-based and model-free variants of k selection via cluster instability and find that their performance is similar. We make our method available in the R package cstab.
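To make the idea of instability-based k selection concrete, the following is a minimal sketch in plain Python, not the cstab implementation. It follows a model-free scheme: draw pairs of bootstrap samples, cluster each, and measure how often the two clusterings disagree about whether two shared points belong together. The normalization shown here (dividing by the disagreement expected when one labeling is randomly shuffled) is an assumption chosen for illustration and may differ from the normalization the paper proposes. All function names (`kmeans`, `disagreement`, `instability`) and the toy data are hypothetical.

```python
import random
from itertools import combinations


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's algorithm on tuples of floats; returns one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def disagreement(la, lb):
    """Fraction of point pairs that are co-clustered in one labeling but not the other."""
    pairs = list(combinations(range(len(la)), 2))
    if not pairs:
        return 0.0
    diff = sum((la[i] == la[j]) != (lb[i] == lb[j]) for i, j in pairs)
    return diff / len(pairs)


def instability(points, k, rng, n_pairs=10, normalize=True, n_shuffles=5):
    """Mean disagreement between clusterings of bootstrap-sample pairs."""
    n = len(points)
    scores = []
    for _pair in range(n_pairs):
        labelings = []
        for _side in range(2):
            # Bootstrap sample, reduced to its distinct point indices.
            idx = sorted(set(rng.choices(range(n), k=n)))
            labs = kmeans([points[i] for i in idx], k, rng)
            labelings.append(dict(zip(idx, labs)))
        # Compare the two clusterings only on points present in both samples.
        shared = sorted(set(labelings[0]) & set(labelings[1]))
        la = [labelings[0][i] for i in shared]
        lb = [labelings[1][i] for i in shared]
        d = disagreement(la, lb)
        if normalize:
            # ASSUMPTION: baseline = disagreement after shuffling one labeling,
            # which keeps cluster sizes but destroys any real structure.
            base = 0.0
            for _shuffle in range(n_shuffles):
                perm = lb[:]
                rng.shuffle(perm)
                base += disagreement(la, perm) / n_shuffles
            if base > 0:
                d /= base
        scores.append(d)
    return sum(scores) / len(scores)


# Toy data: three Gaussian blobs in 2-D (illustrative only).
rng = random.Random(0)
data = [
    (rng.gauss(cx, 0.3), rng.gauss(cy, 0.3))
    for cx, cy in [(0, 0), (4, 0), (2, 4)]
    for _ in range(20)
]
for k in range(2, 6):
    raw = instability(data, k, rng, normalize=False)
    norm = instability(data, k, rng)
    print(k, round(raw, 3), round(norm, 3))
```

Under this sketch, the candidate k with the lowest (normalized) instability would be selected. Dividing by a shuffled baseline is meant to remove the part of the disagreement that depends only on k and the cluster sizes, which is the kind of dependence on k that the abstract says normalization addresses.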