Question

In a data frame (df) I have one variable that indicates region (a factor) and another that weights each observation. To see how many observations there are in each region, I just use summary(df$region).

What I'd like to know is how to get the size of each region once the weight of each observation is taken into account.


Solution

You can use tapply to sum the weights by region (I think this is what you mean, but please clarify if I misunderstood):

> df <- data.frame(region=factor(sample(levels(state.region), 200, replace=TRUE)), weight=runif(200))
> summary(df$region)
North Central     Northeast         South          West 
           55            46            49            50 
> with(df, tapply(weight, region, sum))
North Central     Northeast         South          West 
     27.73835      23.23487      24.71656      26.11786 

If you actually want a weighted sum of some metric (metric * weight), just change the first argument of the tapply call from weight to weight * metric.
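
As a rough sketch of that variation, the snippet below assumes the data frame also has a numeric column called metric. That column, its simulated values, and the helper names weighted_total, weighted_size and weighted_mean are made up for illustration and are not part of the original question. It computes the weighted total per region and, by dividing by the summed weights, a weighted mean per region.

```r
# Hypothetical data: `metric` is an assumed column, not from the original question.
set.seed(1)
df <- data.frame(
  region = factor(sample(levels(state.region), 200, replace = TRUE)),
  weight = runif(200),
  metric = rnorm(200, mean = 50, sd = 10)
)

# Weighted total of the metric in each region
weighted_total <- with(df, tapply(weight * metric, region, sum))

# Sum of the weights in each region (the weighted "size" of the region)
weighted_size <- with(df, tapply(weight, region, sum))

# Weighted mean of the metric in each region
weighted_mean <- weighted_total / weighted_size
```

Dividing the two tapply results works because both are indexed by the same factor levels in the same order, so the division is element by element.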

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow