Question

I have a dataset X as:

customer_id event_type tot_count
931 1 5
231 2 6
231 1 3
333 3 9
444 1 1
931 3 3
333 1 21
444 2 43

I need the sum of tot_count for each combination of customer_id and event_type. In SQL this is a one-liner:

select customer_id, event_type, sum(tot_count) from X group by 1,2

I need the same operation in R.

Solution

You can use the aggregate function:

aggregate(tot_count ~ customer_id + event_type, X, sum)

  customer_id event_type tot_count
1         231          1         3
2         333          1        21
3         444          1         1
4         931          1         5
5         231          2         6
6         444          2        43
7         333          3         9
8         931          3         3
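
In the question's data every (customer_id, event_type) pair occurs only once, so the sums above equal the input values. The sketch below rebuilds X from the question and appends one extra row, which is a made-up assumption and not part of the original data, so that the grouping visibly combines rows. The names X2 and res are illustrative only.

```r
# Rebuild the question's data frame X (values copied from the question).
X <- data.frame(
  customer_id = c(931, 231, 231, 333, 444, 931, 333, 444),
  event_type  = c(1, 2, 1, 3, 1, 3, 1, 2),
  tot_count   = c(5, 6, 3, 9, 1, 3, 21, 43)
)

# Hypothetical extra row (NOT in the original data): a second record for
# customer 931 / event 1, so that sum() has two rows to combine.
X2 <- rbind(X, data.frame(customer_id = 931, event_type = 1, tot_count = 10))

# Same call as in the answer: sum tot_count within each
# customer_id / event_type combination.
res <- aggregate(tot_count ~ customer_id + event_type, data = X2, FUN = sum)
print(res)
```

The result still has eight rows, and the row for customer 931 with event type 1 now holds 15 (5 + 10).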

OTHER TIPS

For fun, here are a few more options:

Since you know SQL, you can use the sqldf package (mydf is the data frame, called X in the question):

> library(sqldf)
> sqldf("select customer_id, event_type, sum(tot_count) from mydf group by 1,2")
  customer_id event_type sum(tot_count)
1         231          1              3
2         231          2              6
3         333          1             21
4         333          3              9
5         444          1              1
6         444          2             43
7         931          1              5
8         931          3              3

If you have a lot of data, use the data.table package:

> library(data.table)
> DT <- data.table(mydf, key = c("customer_id", "event_type"))
> DT[, sum(tot_count), by = key(DT)]
   customer_id event_type V1
1:         231          1  3
2:         231          2  6
3:         333          1 21
4:         333          3  9
5:         444          1  1
6:         444          2 43
7:         931          1  5
8:         931          3  3
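
The summed column above gets the default name V1. As a minimal sketch, assuming the data.table package is installed, the result column can be named directly inside the call, and keyby sorts the output by the grouping columns. The object names DT and out are illustrative only.

```r
library(data.table)

# The question's data, built directly as a data.table.
DT <- data.table(
  customer_id = c(931, 231, 231, 333, 444, 931, 333, 444),
  event_type  = c(1, 2, 1, 3, 1, 3, 1, 2),
  tot_count   = c(5, 6, 3, 9, 1, 3, 21, 43)
)

# .() is data.table's shorthand for list(); naming the element names the
# output column. keyby groups like by and also sorts by the group columns.
out <- DT[, .(tot_count = sum(tot_count)), keyby = .(customer_id, event_type)]
print(out)
```

This returns the same eight rows as above, with the last column called tot_count instead of V1.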
Licensed under: CC-BY-SA with attribution