Here's a "less hacky" way to do this with base R.
set.seed(1)
myDF <- data.frame(A = runif(5), B = c("A", "A", "A", "B", "B"))
within(myDF, {
  Total <- ave(A, B, FUN = sum)  # group-wise sum of A, repeated on every row of the group
  Proportion <- A / Total        # each row's share of its group total
})
#           A B Proportion    Total
# 1 0.2655087 A  0.2193406 1.210486
# 2 0.3721239 A  0.3074170 1.210486
# 3 0.5728534 A  0.4732425 1.210486
# 4 0.9082078 B  0.8182865 1.109890
# 5 0.2016819 B  0.1817135 1.109890
In "dplyr" language, I guess you're looking for mutate:
library(dplyr)

myDF %>%
  group_by(B) %>%
  mutate(Total = sum(A), Proportion = A / Total)
# Source: local data frame [5 x 4]
# Groups: B
#
#           A B    Total Proportion
# 1 0.2655087 A 1.210486  0.2193406
# 2 0.3721239 A 1.210486  0.3074170
# 3 0.5728534 A 1.210486  0.4732425
# 4 0.9082078 B 1.109890  0.8182865
# 5 0.2016819 B 1.109890  0.1817135
From the "Introduction to dplyr" vignette, you would find the following description:
As well as selecting from the set of existing columns, it's often useful to add new columns that are functions of existing columns. This is the job of mutate(). dplyr::mutate() works the same way as plyr::mutate() and similarly to base::transform(). The key difference between mutate() and transform() is that mutate allows you to refer to columns that you just created.
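To make that difference concrete, here is a small sketch on a made-up two-row data frame (df is hypothetical, not the question's data). It assumes no object named Total already exists in your workspace; if one did, transform() would silently pick it up instead of failing.

```r
library(dplyr)

df <- data.frame(A = c(1, 3))

# transform() evaluates every argument against the original columns only,
# so Total is not yet visible when Proportion is computed.
res <- try(transform(df, Total = sum(A), Proportion = A / Total), silent = TRUE)
inherits(res, "try-error")
# [1] TRUE

# mutate() evaluates its arguments in order, so Total can be reused right away.
out <- mutate(df, Total = sum(A), Proportion = A / Total)
out$Proportion
# [1] 0.25 0.75
```

In base R you would have to nest two transform() calls, or use within() as above, to get the same effect.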
Also, since you've tagged this "data.table", you can "chain" commands together in "data.table" quite easily to do something like:
library(data.table)

DT <- data.table(myDF)
# := adds each column by reference; the trailing [] just prints the result
DT[, Total := sum(A), by = B][, Proportion := A / Total][]
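If the chain looks opaque, it may help to see it unrolled. The following is a rough sketch on a small made-up table (the values are hypothetical, chosen so the arithmetic is easy to check): each bracket in the chain simply operates on the table returned by the previous one, and because := modifies the table by reference, no reassignment is needed.

```r
library(data.table)

DT2 <- data.table(A = c(1, 3, 2, 2), B = c("A", "A", "B", "B"))

# Step 1: add the per-group total as a new column, by reference
DT2[, Total := sum(A), by = B]

# Step 2: the new column is now available like any other
DT2[, Proportion := A / Total]

DT2$Proportion
# [1] 0.25 0.75 0.50 0.50
```

Chaining the two steps gives the same table; it only saves typing the table name twice.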