Question

I built a k-means clustering model in R after first normalizing several of the variables. The model provides me with cluster centers, but they are obviously in their normalized state (for example, the center for income is -1.6).

I want to convert that -1.6 back into a non-normalized value to be able to give it a practical meaning (like income is 42,000).

Now I can individually convert that z-score back into a value, but is there a way to do this for several normalized variables at once with an R function?

I can start with pnorm() to get the percentage, but I am looking for something that I can apply back to the original data frame from before I normalized it.


Solution 2

It might be easiest to just calculate the means of the (raw) data once you have the cluster assignments. For example, using plyr:

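# install.packages('plyr')
library(plyr)
dat <- mtcars[, 1:4]
# cluster on the scaled data, but store the assignments next to the raw data
# (kmeans uses random starts, so cluster numbering can differ between runs)
dat$cvar <- kmeans(scale(dat), 3)$cluster
# per-cluster means of the raw (unscaled) variables
ddply(dat, c("cvar"), colwise(mean))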

  cvar      mpg      cyl     disp        hp
1    1 13.41429 8.000000 390.5714 248.42857
2    2 23.97222 4.777778 135.5389  98.05556
3    3 16.78571 8.000000 315.6286 170.00000

OTHER TIPS

You need the standard deviation and mean of the original data. If you have those, the denormalization is simply x = std*z + m, where std and m are the standard deviation and mean of x. The equation follows directly from the definition of the z-score, z = (x - m)/std.
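
As a minimal sketch of that formula applied to all cluster centers at once, assuming the data were normalized with scale(): scale() stores the column means and standard deviations as the attributes "scaled:center" and "scaled:scale" on its result, so they can be reused to back-transform km$centers. The use of mtcars, the seed, and the names scaled, km, mu, sdv and centers_raw are only illustrative choices, not part of the original question.

```r
dat <- mtcars[, 1:4]

# scale() keeps the means and standard deviations it used as attributes
scaled <- scale(dat)
mu  <- attr(scaled, "scaled:center")  # column means of the raw data
sdv <- attr(scaled, "scaled:scale")   # column standard deviations of the raw data

set.seed(1)  # arbitrary seed, only so that repeated runs give the same clusters
km <- kmeans(scaled, centers = 3)

# x = std * z + m, applied column by column to the matrix of centers
centers_raw <- sweep(km$centers, 2, sdv, `*`)
centers_raw <- sweep(centers_raw, 2, mu, `+`)
centers_raw

# sanity check: because scaling is a linear transformation, these should agree
# with the per-cluster means of the raw data
check <- aggregate(dat, by = list(cluster = km$cluster), FUN = mean)
all.equal(unname(as.matrix(check[, -1])), unname(centers_raw))
```

If the normalization was done by hand rather than with scale(), the same two sweep() calls should work with whatever mean and standard deviation vectors were used, as long as they are in the same column order as km$centers.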

Licensed under: CC-BY-SA with attribution