Question

Using the following matrix of distances between 6 Italian cities:

0   662 877 255 412 996
662 0   295 468 268 400
877 295 0   754 564 138
255 468 754 0   219 869
412 268 564 219 0   669
996 400 138 869 669 0

Will R output the order in which it clustered them? For example, single-linkage would tell you:

City 3 and City 6, followed by
City 4 and City 5, followed by
City 1 to City 4 and City 5, finally City 2 to City 3 and City 6.

It is important that I get a numeric output rather than read it off a dendrogram.


Solution

I don't know a complete solution for your problem, but you could use the merge component returned by hclust.

From ?hclust:

merge: an n-1 by 2 matrix. Row i of ‘merge’ describes the merging of clusters at step i of the clustering. If an element j in the row is negative, then observation -j was merged at this stage. If j is positive then the merge was with the cluster formed at the (earlier) stage j of the algorithm. Thus negative entries in ‘merge’ indicate agglomerations of singletons, and positive entries indicate agglomerations of non-singletons.
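
The sign convention can be decoded mechanically. Here is a minimal sketch; `describe_entry` and `describe_merge` are hypothetical helper names (not part of base R), and the 3-point distance matrix is made up purely for illustration:

```r
# Hypothetical helper: translate one entry of hc$merge into words.
describe_entry <- function(j) {
  if (j < 0) sprintf("observation %d", -j)
  else sprintf("cluster formed at step %d", j)
}

# Hypothetical helper: one sentence per row of the merge matrix.
describe_merge <- function(merge) {
  vapply(seq_len(nrow(merge)), function(i) {
    sprintf("step %d: %s + %s", i,
            describe_entry(merge[i, 1]), describe_entry(merge[i, 2]))
  }, character(1))
}

# Toy data (made up): points 1 and 2 are close, point 3 is far from both.
toy <- as.dist(matrix(c(0, 1, 5,
                        1, 0, 4,
                        5, 4, 0), nrow = 3))
describe_merge(hclust(toy, method = "single")$merge)
```

The first row has two negative entries (two singletons are joined). The second row mixes a negative entry with a positive one (a singleton joins the cluster from step 1).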

Your example:

d <- as.dist(read.table(text = "
0   662 877 255 412 996
662 0   295 468 268 400
877 295 0   754 564 138
255 468 754 0   219 869
412 268 564 219 0   669
996 400 138 869 669 0"))

hc <- hclust(d, method="single")

plot(hc)

hc$merge

#     [,1] [,2]  # from bottom up
#[1,]   -3   -6  # City 3 and 6
#[2,]   -4   -5  # City 4 and 5
#[3,]   -1    2  # join City 1 and City 4/5
#[4,]   -2    3  # join City 2 and City 1/4/5
#[5,]    1    4  # join City 3/6 and City 1/2/4/5
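
If you want the cities themselves rather than references to earlier steps, the merge matrix can be expanded step by step, and hc$height gives the distance at which each merge happened. Below is a minimal sketch using the same distance matrix as above; `merge_members` is a hypothetical helper name, not a base R function:

```r
d <- as.dist(read.table(text = "
0   662 877 255 412 996
662 0   295 468 268 400
877 295 0   754 564 138
255 468 754 0   219 869
412 268 564 219 0   669
996 400 138 869 669 0"))
hc <- hclust(d, method = "single")

# Hypothetical helper: list the observations contained in each merged cluster.
merge_members <- function(hc) {
  members <- vector("list", nrow(hc$merge))
  # Negative entry = single observation; positive entry = earlier step's cluster.
  expand <- function(j) if (j < 0) -j else members[[j]]
  for (i in seq_len(nrow(hc$merge))) {
    members[[i]] <- sort(c(expand(hc$merge[i, 1]), expand(hc$merge[i, 2])))
  }
  members
}

steps <- merge_members(hc)

# One row per merge: step number, merge distance, and the cities in the new cluster.
data.frame(step    = seq_along(steps),
           height  = hc$height,
           members = vapply(steps, paste, character(1), collapse = " "))
```

For single linkage on this matrix, the heights are 138, 219, 255, 268 and 295, which are the distances between cities 3 and 6, 4 and 5, 1 and 4, 2 and 5, and 2 and 3 respectively.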
Licensed under: CC-BY-SA with attribution