There might be more elegant ways to do it, but this gets the job done. The key here is the `split<-` function.
df$count <- NA # This column must be added prior to calling `split<-`
# because otherwise we can't assign values to it
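split(df, df$var) <- lapply(split(df, df$var), function(x) {
  # For each row i, check whether id2[i] has appeared in id1 up to and including row i
  hit <- vapply(seq_len(nrow(x)), function(i) x$id2[i] %in% x$id1[1:i], logical(1))
  x$count <- cumsum(hit)
  x
})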
The result is shown below. It differs in places from your expected output, so either there are errors in your manually constructed result or I have misunderstood the question.
  id1 id2 var count
1   1   2   a     0
2   2   3   b     0
3   2   1   a     1
4   3   2   a     2
5   2   3   a     3
6   4   2   a     4
7   3   1   b     0
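To see what `split<-` is doing here, a minimal sketch on a plain vector may help. The data below is a toy example made up for illustration, not taken from the question: `split()` breaks the vector into one piece per group, and the replacement form writes the transformed pieces back into their original positions.

```r
# Toy data (hypothetical, for illustration only): six values in two groups
vals <- c(10, 20, 30, 40, 50, 60)
grp  <- c("a", "b", "a", "b", "a", "a")

# split() returns a list with one element per group
pieces <- split(vals, grp)

# `split<-` puts each transformed piece back where its elements came from
split(vals, grp) <- lapply(pieces, cumsum)
vals
# 10 20 40 60 90 150
```

The same mechanism works on data frames, which is why the `count` column has to exist before the assignment: the pieces being written back must have the same columns as `df`.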
Update:
Just to make this answer complete and working, this is my take on your solution. It is essentially the same, but I think it's nicer and more readable to have the `ave` inside the `lapply`.
df$count <- NA
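split(df, df$var) <- lapply(split(df, df$var), function(x) {
  # TRUE where id2[i] has appeared in id1 up to and including row i
  hit <- vapply(seq_len(nrow(x)), function(i) x$id2[i] %in% x$id1[1:i], logical(1))
  # Running count of hits, kept separately for each value of id2
  x$count <- ave(hit, x$id2, FUN = cumsum)
  x
})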
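For anyone who wants to run the updated version end to end, here is a self-contained sketch. The input data frame is an assumption: it is reconstructed from the `id1`, `id2` and `var` columns of the output printed earlier, since the original data is not repeated in this answer.

```r
# Assumed input, rebuilt from the id1/id2/var columns of the printed result
df <- data.frame(
  id1 = c(1, 2, 2, 3, 2, 4, 3),
  id2 = c(2, 3, 1, 2, 3, 2, 1),
  var = c("a", "b", "a", "a", "a", "a", "b")
)

df$count <- NA  # must exist before calling `split<-`
split(df, df$var) <- lapply(split(df, df$var), function(x) {
  # TRUE where id2[i] has appeared in id1 up to and including row i
  hit <- vapply(seq_len(nrow(x)), function(i) x$id2[i] %in% x$id1[1:i], logical(1))
  # Running count of hits within each id2 value
  x$count <- ave(hit, x$id2, FUN = cumsum)
  x
})

df$count
# 0 0 1 1 1 2 0
```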