scale is your friend here: it normalises to mean=0 and sd=1, and if sd=1 then var=1.
> mean(scale(1:10))
[1] 0
> sd(scale(1:10))
[1] 1
> var(scale(1:10))
     [,1]
[1,]    1
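To make it clearer what scale is doing with its default arguments, here is a minimal sketch that computes the same z-scores by hand. The names z_manual and z_scale are just illustrative variables, not anything from base R.

```r
x <- 1:10

# Manual z-score: subtract the mean, then divide by the sample standard deviation
z_manual <- (x - mean(x)) / sd(x)

# scale() returns a one-column matrix with extra attributes,
# so convert it to a plain numeric vector before comparing
z_scale <- as.numeric(scale(x))

# Compare the two, allowing for floating-point tolerance
all.equal(z_manual, z_scale)
```

Note that scale returns a matrix rather than a vector, which is why var(scale(1:10)) above prints as a 1x1 matrix.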
Try some example data:
set.seed(42)
dat <- data.frame(freq=sample(1:100), scores=rnorm(100, mean=4, sd=2))
dat$bins <- cut(dat$freq, breaks=c(0, 1:10*10), include.lowest=TRUE)
Now use ave to scale the scores within each of the bins:
dat$scaled <- with(dat,ave(scores,bins,FUN=scale))
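If you would rather not rely on scale's matrix return value, a sketch of an equivalent approach is to pass ave an explicit function. Here zscore is a hypothetical helper name (not part of base R) and scaled2 is just an illustrative column name; the data setup is repeated so the snippet runs on its own.

```r
# Same example data as above
set.seed(42)
dat <- data.frame(freq=sample(1:100), scores=rnorm(100, mean=4, sd=2))
dat$bins <- cut(dat$freq, breaks=c(0, 1:10*10), include.lowest=TRUE)

# zscore is a hypothetical helper: centre on the group mean,
# then divide by the group's sample standard deviation
zscore <- function(x) (x - mean(x)) / sd(x)

# ave() applies zscore within each level of bins and
# returns the results in the original row order
dat$scaled2 <- with(dat, ave(scores, bins, FUN=zscore))

# Compare against the scale() version
all.equal(dat$scaled2, as.numeric(with(dat, ave(scores, bins, FUN=scale))))
```

Either way, a bin that contains only a single observation cannot be standardised, because its standard deviation is undefined. That does not happen here since every bin holds 10 values.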
You can check the results with aggregate or similar.
The mean is 0 (or within floating-point rounding error of 0) in each bin:
> aggregate(scaled ~ bins, data=dat, FUN=function(x) round(mean(x), 2) )
       bins scaled
1    [0,10]      0
2   (10,20]      0
3   (20,30]      0
4   (30,40]      0
5   (40,50]      0
6   (50,60]      0
7   (60,70]      0
8   (70,80]      0
9   (80,90]      0
10 (90,100]      0
The sd is 1 in each bin:
> aggregate(scaled ~ bins, data=dat, FUN=sd)
       bins scaled
1    [0,10]      1
2   (10,20]      1
3   (20,30]      1
4   (30,40]      1
5   (40,50]      1
6   (50,60]      1
7   (60,70]      1
8   (70,80]      1
9   (80,90]      1
10 (90,100]      1