Question

In a data frame (df) I have one variable that indicates the region (a factor) and another that gives a weight for every observation. To find out how many observations there are in each region, I just use summary(df$region).

What I'd like to know is: how can I get the size of each region when the weights of the observations are taken into account?

Solution

You can use tapply to sum the weights by region (I think this is what you mean, but please clarify if I misunderstood):

> df <- data.frame(region=sample(levels(state.region), 200, replace=TRUE), weight=runif(200), stringsAsFactors=TRUE)
> summary(df$region)
North Central     Northeast         South          West 
           55            46            49            50 
> with(df, tapply(weight, region, sum))
North Central     Northeast         South          West 
     27.73835      23.23487      24.71656      26.11786 

If you actually want a weighted total of some metric, pass weight * metric as the first argument to tapply instead of just weight.
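As a rough sketch of that variation: the metric column below is hypothetical (the question does not name one), and the seed and distributions are arbitrary choices so the example can be reproduced. It also shows xtabs as one alternative way to get the weighted region sizes.

```r
# Hypothetical data: 'metric' is an assumed numeric column, not from the original post.
set.seed(1)
df <- data.frame(
  region = factor(sample(levels(state.region), 200, replace = TRUE)),
  weight = runif(200),
  metric = rnorm(200, mean = 10)
)

# Weighted size of each region: sum of the weights.
size_w <- with(df, tapply(weight, region, sum))

# Weighted total of the metric per region: sum of weight * metric.
total_w <- with(df, tapply(weight * metric, region, sum))

# Weighted mean of the metric per region: weighted total over weighted size.
mean_w <- total_w / size_w

# xtabs returns the same weighted sizes as a one-way table.
size_x <- xtabs(weight ~ region, data = df)
```

Dividing the weighted total by the weighted size should match what weighted.mean() gives when applied to each region separately.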

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow