I would like to weight cases when plotting graphs with ggplot. I have a specific weight factor for each case, for instance:
value weight
2 0.34
5 0.75
6 2.31
and so on. Plotting simple grouped bars (a cross tabulation) is easy, because I can use the xtabs function:
ftab <- round(xtabs(weightBy ~ varCount + varGroup),0)
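To make the call above concrete, here is a minimal sketch on toy data. The numbers are made up for illustration; only the names varCount, varGroup and weightBy come from my real code.

```r
# Hypothetical toy data (values invented for illustration)
varCount <- c(1, 1, 2, 2, 2)
varGroup <- c("a", "b", "a", "a", "b")
weightBy <- c(0.34, 0.75, 2.31, 1.20, 0.90)

# xtabs sums the left-hand side (the weights) within each cell,
# so every cell holds a weighted count; round() makes it a whole number
ftab <- round(xtabs(weightBy ~ varCount + varGroup), 0)
print(ftab)
```

Here the cell for varCount = 2 and varGroup = "a" contains the rounded sum of 2.31 and 1.20.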
When I want to plot histograms, simple bars or single box plots with weighted cases, I want to keep the distribution, so I use the following function to weight the cases:
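weightby <- function(var, weight) {
  # distinct categories, in order of first appearance
  items <- unique(var)
  newvar <- c()
  # seq_along() is safe even when items is empty (1:length(items) is not)
  for (i in seq_along(items)) {
    # weighted count for this category, rounded to a whole number of cases
    newcount <- round(sum(weight[which(var == items[i])]))
    # repeat the category value once per weighted case
    newvar <- c(newvar, rep(items[i], newcount))
  }
  return(newvar)
}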
if (!is.null(weightBy)) {
variable <- weightby(variable, weightBy)
}
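As a small illustration of what this expansion does, here is a sketch with made-up numbers. The name weightby_vec is hypothetical; it is just a vectorised restatement of the same idea (sum the weights per category, round, repeat the category that many times), not something I use elsewhere.

```r
# Hypothetical vectorised restatement of the weighting function
weightby_vec <- function(var, weight) {
  items <- unique(var)
  # rounded sum of weights per distinct value
  counts <- round(vapply(items, function(x) sum(weight[var == x]), numeric(1)))
  # repeat each distinct value according to its weighted count
  rep(items, times = counts)
}

# Toy data (invented): four cases, three distinct values
var    <- c(2, 5, 2, 6)
weight <- c(0.34, 0.75, 2.31, 1.40)

weighted <- weightby_vec(var, weight)
print(weighted)
print(length(var))       # number of original cases
print(length(weighted))  # number of weighted "cases"
```

The result has a different length than the input and its elements no longer correspond to the original rows, so it cannot be paired with a second variable such as a grouping factor.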
However, this function ignores the original case order: the expanded "cases" come out grouped by category, so they no longer line up case by case with any other variable.
If I want to plot grouped box plots, though, I need
a) the weighted variable with weighted counts
b) the weighted variable with weighted groups
c) the weighted means, median and quantiles within each group
How can I do this? I have the correct weighted cross tabulation, but no weighted means for each subgroup, because I cannot use the function shown above to create tables (the original case order is lost).
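To make point c) concrete, this is roughly the kind of per-group statistic I mean, sketched in base R on made-up data. The data frame df and the helper wquantile are hypothetical, and the quantile definition used here (smallest value whose cumulative weight share reaches p) is only one of several possible conventions. I am not sure this is the right approach, and it does not yet give me the weighted box plots in ggplot.

```r
# Hypothetical toy data: one row per case, so value, group and weight stay aligned
df <- data.frame(
  value  = c(2, 5, 6, 3, 4, 8),
  group  = c("a", "a", "a", "b", "b", "b"),
  weight = c(0.34, 0.75, 2.31, 1.00, 0.60, 1.40)
)

# Hypothetical helper: smallest value whose cumulative weight share reaches p
wquantile <- function(x, w, p) {
  o  <- order(x)
  cw <- cumsum(w[o]) / sum(w)
  x[o][which(cw >= p)[1]]
}

# Weighted mean and weighted median within each group, without expanding cases
stats <- sapply(split(df, df$group), function(d) {
  c(mean   = weighted.mean(d$value, d$weight),
    median = wquantile(d$value, d$weight, 0.5))
})
print(stats)
```

Because the weights are applied row by row instead of expanding the cases, the pairing between each value and its group is never lost.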
Any hints are much appreciated!