Applicability and Interpretability of Hierarchical Agglomerative Clustering With or Without Contiguity Constraints

Hierarchical Agglomerative Classification (HAC) with Ward's linkage has been widely used since its introduction in Ward (1963). The present article reviews the extensions of the method to various types of input data and to the constrained framework, and states the conditions under which each is applicable. Several versions of the dendrogram used to represent the results are also presented, and their properties are clarified. While some of these results can be found scattered across a heterogeneous literature, we clarify and complete them within a uniform framework. In particular, this study reveals an important distinction between a consistency property of the dendrogram and the absence of crossover within it. Finally, a simulation study shows that the constrained version of HAC can sometimes provide more relevant results than the unconstrained version, even though the constrained version optimizes the objective criterion over a reduced set of solutions at each step. Overall, the article provides comprehensive recommendations for the use of HAC and constrained HAC depending on the input data, as well as for the representation of the results.
