
AGGLOMERATIVE HIERARCHICAL CLUSTERING (AHC)

Use agglomerative hierarchical clustering (AHC) to group similar observations into clusters, based on their description by a set of quantitative variables, binary (0/1) variables, or possibly variables of any type.

XLSTAT offers several aggregation methods:

Ward's method (inertia)
Ward's method (variance)
Complete linkage
Simple linkage
Strong linkage
Flexible linkage
Unweighted pair-group average
Weighted pair-group average
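
To illustrate how the choice of aggregation method changes the way clusters are merged, here is a minimal pure-Python sketch of agglomerative clustering. It is not XLSTAT's implementation. The function names and the method labels `single`, `complete` and `average` are hypothetical, chosen for this example; they are assumed to correspond to simple linkage, complete linkage and the unweighted pair-group average in the list above. Euclidean distance is used as the dissimilarity.

```python
from itertools import combinations
from math import dist  # Euclidean distance between two points (Python 3.8+)


def cluster_dissimilarity(a, b, points, method):
    """Dissimilarity between clusters a and b (lists of point indices)."""
    pairwise = [dist(points[i], points[j]) for i in a for j in b]
    if method == "single":      # closest pair of members
        return min(pairwise)
    if method == "complete":    # farthest pair of members
        return max(pairwise)
    if method == "average":     # mean over all pairs of members
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown method: {method}")


def agglomerate(points, n_clusters, method="average"):
    """Start with one cluster per observation, then repeatedly merge
    the two least dissimilar clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_dissimilarity(pair[0], pair[1], points, method),
        )
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


# Illustrative data: two tight pairs and one isolated observation.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
print(agglomerate(points, 3, method="single"))  # [[0, 1], [2, 3], [4]]
```

This brute-force version recomputes every pairwise dissimilarity at each step, so it is only suitable for small examples. Ward's method and flexible linkage update the dissimilarities with different formulas and are not covered by this sketch.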

XLSTAT offers several similarity and dissimilarity measures, each suited to a particular type of data:

For quantitative data:

Similarity                                    Dissimilarity
Pearson's coefficient of correlation          Euclidean distance
Spearman's coefficient of rank correlation    Chi-square distance
Kendall's coefficient of rank correlation     Manhattan distance
Inertia                                       Pearson's dissimilarity
Covariance (n)                                Spearman's dissimilarity
Covariance (n-1)                              Kendall's dissimilarity
Percent agreement                             Percent disagreement
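
As a rough illustration of three of the dissimilarities listed for quantitative data, the sketch below computes the Euclidean distance, the Manhattan distance and a Pearson-based dissimilarity between two observations. The function names are hypothetical. The Pearson dissimilarity is taken here as 1 - r, which is one common convention and an assumption; the text does not give the exact formula XLSTAT uses.

```python
from math import sqrt


def euclidean(x, y):
    """Square root of the sum of squared coordinate differences."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def pearson_r(x, y):
    """Pearson's coefficient of correlation between two vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)  # undefined if either vector is constant


def pearson_dissimilarity(x, y):
    """Assumption: dissimilarity defined as 1 - r (0 when perfectly correlated)."""
    return 1.0 - pearson_r(x, y)


# Illustrative data: y is exactly twice x, so the two profiles are
# perfectly correlated even though they are far apart in distance.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
print(euclidean(x, y), manhattan(x, y), pearson_dissimilarity(x, y))
```

The example shows why the choice matters: distance-based measures treat the two observations as far apart, while a correlation-based measure treats them as nearly identical because their profiles have the same shape.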

For binary data (0/1):

Similarity/Dissimilarity
Jaccard's coefficient
Dice coefficient
Sokal & Sneath coefficient (2)
Rogers & Tanimoto coefficient
Simple matching coefficient
Sokal & Sneath coefficient (1)
Phi coefficient
Ochiai's coefficient
Kulczinski's coefficient
Percent agreement
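
Most of these coefficients are built from the same four counts for a pair of binary observations: a (both 1), b (first 1, second 0), c (first 0, second 1) and d (both 0). The sketch below uses the usual textbook formulas for a few of them; it is an illustration with hypothetical function names, not a statement of how XLSTAT computes them. The two Sokal & Sneath variants and Kulczinski's coefficient are left out because their definitions and numbering vary between references.

```python
from math import sqrt


def contingency(x, y):
    """Return the counts (a, b, c, d) for two 0/1 vectors of equal length."""
    a = sum(1 for u, v in zip(x, y) if u == 1 and v == 1)
    b = sum(1 for u, v in zip(x, y) if u == 1 and v == 0)
    c = sum(1 for u, v in zip(x, y) if u == 0 and v == 1)
    d = sum(1 for u, v in zip(x, y) if u == 0 and v == 0)
    return a, b, c, d


def binary_similarities(x, y):
    """Textbook forms of several binary similarity coefficients.

    No guard against zero denominators: degenerate vectors
    (for example all zeros) will raise ZeroDivisionError.
    """
    a, b, c, d = contingency(x, y)
    return {
        "jaccard": a / (a + b + c),
        "dice": 2 * a / (2 * a + b + c),
        "simple_matching": (a + d) / (a + b + c + d),
        "rogers_tanimoto": (a + d) / (a + d + 2 * (b + c)),
        "ochiai": a / sqrt((a + b) * (a + c)),
        "phi": (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d)),
    }


# Illustrative data: a = 2, b = 1, c = 1, d = 2.
x = [1, 1, 0, 0, 1, 0]
y = [1, 0, 0, 1, 1, 0]
for name, value in binary_similarities(x, y).items():
    print(f"{name}: {value:.3f}")
```

Jaccard, Dice and Ochiai ignore joint absences (d), whereas simple matching, Rogers & Tanimoto and Phi count them, which is usually the main consideration when choosing among them.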

Note: for non-binary categorical variables, it is preferable to first perform a Multiple Correspondence Analysis (MCA) and then use the coordinates of the observations on the factorial axes as new variables.