Question

I have 2 vectors:

a <- c(6,5,3,1,6,7,4,5,3,2)
b <- c(2,1,1,2,3,2,1,3,3,2)

I want simple code that returns a vector of the means of the values in "a", grouped by the positions that share the same value in "b". The result should also be ordered the same way as the levels of b, i.e. levels(as.factor(b)).

solution <- c(mean(c(5,3,4)), mean(c(6,1,7,2)), mean(c(6,5,3)))

Simpler example:

a <- c(1,2,3,4)
b <- c(1,2,2,1)
solution <- c(2.5,2.5)

Thanks a lot!


Solution

b <- factor(b, levels=c(2,1,3)) ## Sets the order of the factor's levels.
tapply(a, b, FUN=mean)
#        2        1        3 
# 4.000000 4.000000 4.666667 
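
If the default ordering of levels(as.factor(b)) is what is wanted, as the question describes, the explicit levels argument can probably be dropped. A minimal sketch, assuming a and b are the original numeric vectors from the question (the names means_by_level and solution are used here only for illustration):

```r
a <- c(6, 5, 3, 1, 6, 7, 4, 5, 3, 2)
b <- c(2, 1, 1, 2, 3, 2, 1, 3, 3, 2)

# tapply() converts b to a factor internally, so the groups come out in
# the same order as levels(as.factor(b)).
means_by_level <- tapply(a, b, FUN = mean)

# tapply() returns a named 1-d array; as.vector() strips the names and
# dimensions to leave a plain numeric vector.
solution <- as.vector(means_by_level)
solution
```

Keeping the names (that is, skipping the as.vector() step) may be preferable when it matters which mean belongs to which level.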

Other suggestions

There are several ways to achieve this. One was already mentioned by @Ananda. Some alternatives are:

aggregate(a,list(b),mean)
library(plyr)
ddply(data.frame(a, b), .(b), summarize, mean = mean(a))
by(a,b,mean) # this is just a wrapper for tapply

Which one to choose depends on the output format you want and on the input format of the actual data (e.g. vector vs. data frame).
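
To make the difference in output format concrete, here is a small base R sketch comparing what tapply and aggregate return for the vectors from the question. The names res_vec, res_df, res_formula and df are hypothetical and used only for illustration:

```r
a <- c(6, 5, 3, 1, 6, 7, 4, 5, 3, 2)
b <- c(2, 1, 1, 2, 3, 2, 1, 3, 3, 2)

# tapply() returns a named 1-d array: convenient when a vector is wanted.
res_vec <- tapply(a, b, mean)

# aggregate() returns a data frame with one row per group; with an
# unnamed grouping list the columns are called Group.1 and x.
res_df <- aggregate(a, list(b), mean)

# The formula interface may be more natural when the data already live
# in a data frame.
df <- data.frame(a = a, b = b)
res_formula <- aggregate(a ~ b, data = df, FUN = mean)

# Inspect the structure of each result.
str(res_vec)
str(res_df)
str(res_formula)
```

All three hold the same group means; only the container differs, so the choice mostly comes down to what the downstream code expects.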

A data.table solution:

library(data.table)
d = data.table(a = c(6,5,3,1,6,7,4,5,3,2), b = c(2,1,1,2,3,2,1,3,3,2))

d[, mean(a), by = b][order(b)] # (or [order(b), V1] if you just want the means)
Licensed under: CC-BY-SA with attribution