Question

I ran a clustering algorithm in R:

hc <- hclust(dist(data), method="complete")

I want to export this result to MATLAB (in the same form as the output of linkage) in order to compute the inconsistency coefficient. Is this possible?

Solution

According to the documentation of hclust, hc$merge gives the indices of the observations or clusters merged at each step, and hc$height gives the distance at which each merge takes place.
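
To make the sign convention in hc$merge concrete, here is a minimal sketch on a made-up data set (the five values and the name hc_demo are only for illustration, they are not from the question). Negative entries refer to original observations, and positive entries refer to the cluster formed at that earlier row.

```r
# Hypothetical toy data: five points on a line
x <- matrix(c(0, 1, 5, 7, 20), ncol = 1)
hc_demo <- hclust(dist(x), method = "complete")

# One row per merge: negative = original observation,
# positive = cluster created at that earlier row
print(hc_demo$merge)

# Distance at which each merge happens, in the same row order
print(hc_demo$height)
```

Here the third row joins the clusters built in rows 1 and 2, so it contains the positive entries 1 and 2.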

With USArrests as a sample dataset:

hc <- hclust(dist(USArrests), method="complete")

data.mat <- data.matrix(data.frame(hc$merge, hc$height))
> head(data.mat)
      X1  X2 hc.height
[1,] -15 -29  2.291288
[2,] -17 -26  3.834058
[3,] -14 -16  3.929377
[4,] -13 -32  6.236986
[5,] -35 -44  6.637771
[6,] -36 -46  7.355270

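# write.csv ignores col.names (with a warning), so use write.table to drop the header
write.table(data.mat, "data_mat.csv", sep=",", col.names=FALSE, row.names=FALSE)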

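Then, in MATLAB, after reading the data into a matrix Z, the following gives the inconsistency coefficients of the links. Note that MATLAB's linkage format numbers the m original observations 1..m and the cluster formed at step i as m+i, whereas hc$merge uses negative numbers for observations and i for clusters, so the first two columns must be converted before calling inconsistent.

incons_Z = inconsistent(Z)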
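
A minimal sketch of that index conversion in R, assuming MATLAB expects leaves numbered 1..m and the cluster from merge step i numbered m+i. The helper to_matlab and the file name Z_matlab.csv are names made up for this example, and I have not checked the result against MATLAB itself.

```r
hc <- hclust(dist(USArrests), method = "complete")
m <- nrow(hc$merge) + 1   # number of original observations

# Assumed mapping: observation -j becomes j, cluster i becomes m + i
to_matlab <- function(idx) ifelse(idx < 0, -idx, idx + m)

# Three columns, as in the output of linkage: two indices and the height
Z <- cbind(to_matlab(hc$merge[, 1]), to_matlab(hc$merge[, 2]), hc$height)

# Write without header or row names so MATLAB can read a plain numeric matrix
write.table(Z, "Z_matlab.csv", sep = ",", row.names = FALSE, col.names = FALSE)
```

The file can then be read into Z on the MATLAB side and passed to inconsistent.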

OR

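You could compute a rough version in R using the scale function, which subtracts the mean of all merge heights from each height and divides by the standard deviation (sd), i.e. it standardizes the heights. Note that this compares each link with every link in the tree, whereas MATLAB's inconsistent compares a link only with the links up to a given depth below it (2 by default), so the numbers will generally differ from MATLAB's.

inconsis_scale <- as.vector(scale(data.mat[,3]))

Alternatively, using the functions mean and sd:

inconsis_base <- (data.mat[,3] - mean(data.mat[,3])) / sd(data.mat[,3])

The two approaches yield the same result, which can be confirmed by: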

> all.equal(inconsis_scale,inconsis_base)
[1] TRUE
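
For something closer to what MATLAB's inconsistent reports, each link can be compared only with the links a few levels below it. The following is a sketch of that idea as I understand it (the link itself plus the links up to depth - 1 levels below, sample standard deviation, coefficient set to 0 when the deviation is 0). The function name inconsistency and its depth argument are hypothetical, not part of base R, and the output is not verified against MATLAB.

```r
inconsistency <- function(hc, depth = 2) {
  n_links <- nrow(hc$merge)

  # Heights of link k and of the links up to d - 1 levels below it
  collect <- function(k, d) {
    if (d == 0) return(numeric(0))
    kids <- hc$merge[k, ]
    kids <- kids[kids > 0]          # positive entries are earlier links
    c(hc$height[k], unlist(lapply(kids, collect, d = d - 1)))
  }

  # One row per link: mean, sd, number of links used, coefficient
  t(sapply(seq_len(n_links), function(k) {
    h <- collect(k, depth)
    s <- if (length(h) > 1) sd(h) else 0
    coef <- if (s > 0) (hc$height[k] - mean(h)) / s else 0
    c(mean = mean(h), sd = s, n = length(h), coef = coef)
  }))
}

hc <- hclust(dist(USArrests), method = "complete")
head(inconsistency(hc))
```

Links that join two original observations have no links below them, so their coefficient is 0 under this definition.
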
Licensed under: CC-BY-SA with attribution