fulltable[which((c(fulltable$pos[-1], NA) - fulltable$pos) > 50) + 1, new_group := 2:(.N+1)]
fulltable[is.na(new_group), new_group := 1]
fulltable[, c("lastid_new", "new_group") := list(cummax(new_group), NULL)]
R : Efficient loop on row with data.table
09-07-2023
Question
I am using data.table in R and looping over my table, but it's really slow because of the table's size. I wonder if someone has any idea on
I have a set of values that I want to "cluster". Each row has a position, which is a positive integer. Here is a simple example of the data:
library(data.table)
# Here is a toy example
fulltable = seq(1, 4) * seq(1, 1000, 10)
fulltable = data.table(pos = fulltable[order(fulltable)])
fulltable$id = 1
So I loop over the rows, and when the gap between two consecutive positions is more than 50, I start a new group:
# Here is the main loop
lastposition = fulltable[1]$pos
lastid = fulltable[1]$id
for (i in 2:nrow(fulltable)) {
  # start a new group when the gap to the previous position exceeds 50
  if (fulltable[i]$pos - 50 > lastposition) {
    lastid = lastid + 1
    print(lastid)
  }
  fulltable[i]$id = lastid
  lastposition = fulltable[i]$pos
}
Any idea for an effi
Solution
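One way to avoid the row-by-row loop is to compute all the gaps at once with `diff()` and turn the break points into group numbers with `cumsum()`. The following is a minimal sketch of that idea on the toy data from the question. It is not necessarily the original answer's approach, and it assumes `pos` is already sorted and the table has at least one row.

```r
library(data.table)

# Toy data from the question: sorted positive positions
fulltable <- data.table(pos = sort(seq(1, 4) * seq(1, 1000, 10)))

# diff(pos) gives the gap between each row and the previous one.
# The first row has no predecessor, so FALSE is prepended for it.
# cumsum() counts the breaks seen so far; adding 1 makes the first group 1.
fulltable[, id := 1L + cumsum(c(FALSE, diff(pos) > 50))]
```

Because `:=` adds the column by reference in a single vectorised pass, nothing is copied or reassigned per row. The condition `diff(pos) > 50` is the same test as `fulltable[i]$pos - 50 > lastposition` in the loop.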
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow