Question

I have a set of user recommendations

review=matrix(c(5:1,10,2,1,1,2), nrow=5, ncol=2, dimnames=list(NULL,c("Star","Votes")))

and wanted to use summary(review) to show basic properties: mean, median, quartiles, and min/max.

But it gives back a separate summary of each column. I refrain from using data.frame because the 'Star' factor is ordered. How can I tell R that Star is an ordered factor of numeric scores and Votes is the frequency of each score?


Solution

I'm not exactly sure what you mean by taking the mean in general if Star is supposed to be an ordered factor. However, in the example you give where Star is actually a set of numeric values, you can use the following:

library(Hmisc)

R> review=matrix(c(5:1,10,2,1,1,2), nrow=5, ncol=2, dimnames=list(NULL,c("Star","Votes")))

R> wtd.mean(review[, 1], weights = review[, 2])
[1] 4.0625

R> wtd.quantile(review[, 1], weights = review[, 2])
  0%  25%  50%  75% 100% 
1.00 3.75 5.00 5.00 5.00 
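If you would rather not depend on Hmisc, something similar can be sketched in base R. This assumes the review matrix from the question; star, votes, and expanded are just local names introduced here. The quantiles come from repeating each star value once per vote, which is fine for small counts but is not a general weighted-quantile routine.

```r
review <- matrix(c(5:1, 10, 2, 1, 1, 2), nrow = 5, ncol = 2,
                 dimnames = list(NULL, c("Star", "Votes")))
star  <- review[, "Star"]
votes <- review[, "Votes"]

# Weighted mean: sum(star * votes) / sum(votes)
w_mean <- weighted.mean(star, w = votes)

# Expand to one entry per vote, then take ordinary quantiles
expanded <- rep(star, times = votes)
q <- quantile(expanded)

print(w_mean)
print(q)
```

For this data the results should agree with the wtd.mean and wtd.quantile output above.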

OTHER TIPS

I don't understand what the problem is. Why shouldn't you use data.frame? It can hold an ordered factor just fine:

rv <- data.frame(star = ordered(review[, 1]), votes = review[, 2])

Then expand the data.frame into a single vector with one entry per vote:

( vts <- with(rv, rep(star, votes)) )
 [1] 5 5 5 5 5 5 5 5 5 5 4 4 3 2 1 1
Levels: 1 < 2 < 3 < 4 < 5

Then do the summary... I just don't know what kind of summary, since summary() on a factor only returns the count per level, which brings you back to the start. O_o

summary(vts)
 1  2  3  4  5 
 2  1  1  2 10 

EDIT (on @Prasad's suggestion)

Since vts is an ordered factor, you should convert it to numeric and then calculate the summary (for the moment I will disregard the background statistical issues):

nvts <- as.numeric(levels(vts)[vts])  ## numeric conversion
summary(nvts)  ## "ordinary" summary
fivenum(nvts)  ## Tukey's five number summary
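If treating the stars as numbers feels statistically dubious, quartiles can also be taken on the ordered factor itself. A minimal sketch, assuming the review matrix from the question: quantile() accepts ordered factors only with type = 1 or type = 3, which always return an observed level rather than interpolating. A mean is not defined for a factor, so this covers the median and quartiles only.

```r
review <- matrix(c(5:1, 10, 2, 1, 1, 2), nrow = 5, ncol = 2,
                 dimnames = list(NULL, c("Star", "Votes")))

# One ordered-factor entry per vote
vts <- rep(ordered(review[, "Star"]), times = review[, "Votes"])

# Quartiles without numeric conversion; the result is again an ordered factor
q <- quantile(vts, probs = c(0.25, 0.5, 0.75), type = 1)
print(q)
```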

Just to clarify: when you say you would like "mean, median, quartiles and min/max", are you talking in terms of number of stars, e.g. mean = 4.062 stars? Then, using aL3xa's code, would something like summary(as.numeric(as.character(vts))) be what you want?
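The detour through as.character() (or through levels(), as in the conversion shown earlier) matters because as.numeric() applied directly to a factor returns the internal level codes, not the labels. With stars 1 to 5 the two happen to coincide, so the following sketch uses a made-up factor f whose levels have gaps to show the difference:

```r
# Hypothetical ratings where not every star value occurs
f <- ordered(c(2, 4, 5, 5))

codes  <- as.numeric(f)                # level codes: 1 2 3 3
values <- as.numeric(as.character(f))  # actual labels: 2 4 5 5
same   <- as.numeric(levels(f))[f]     # equivalent idiom: 2 4 5 5

print(codes)
print(values)
print(same)
```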

Licensed under: CC-BY-SA with attribution