For your first question, if the rows aren't already sorted, I'd use setkey on id, year to sort them (rather than base::order, which can be much slower on large tables). id is included in the key so that the results also come out in the order you expect for question 2.
setkey(DT, id, year)
# keep a group if it has a single row or its values are strictly increasing
DT[,
   if (.N == 1L || all(value[2:.N] - value[1:(.N - 1L)] > 0)) .SD,
   by = list(id)]
id year value
1: b 2001 1
2: b 2002 2
3: b 2003 3
4: c 2001 4
5: c 2002 5
6: c 2003 6
For your second question:
DT[, if (.N == 1L) 1L else sum(value[2:.N]-value[1:(.N-1)] > 0), by=list(id)]
id V1
1: a 1
2: b 2
3: c 2
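As a rough cross-check of the per-group count, here is a minimal base-R sketch. It swaps in tapply and diff for the data.table call, so it is not the approach recommended in this answer, only a way to sanity-check the numbers. The data frame is hypothetical: the rows for b and c match the output shown earlier, while the values for a are invented so that the group has exactly one increase.

```r
# Hypothetical sample data; the values for id "a" are an assumption.
df <- data.frame(
  id    = rep(c("a", "b", "c"), each = 3),
  year  = rep(2001:2003, times = 3),
  value = c(3, 1, 2,   # a: down, then up (one increase)
            1, 2, 3,   # b: strictly increasing
            4, 5, 6)   # c: strictly increasing
)

# Sort by id, then year, so consecutive differences follow the years.
df <- df[order(df$id, df$year), ]

# Hypothetical helper mirroring the logic in j: a single-row group
# counts as 1, otherwise count the positive year-over-year differences.
count_increases <- function(v) {
  if (length(v) == 1L) 1L else sum(diff(v) > 0)
}

increases <- tapply(df$value, df$id, count_increases)
print(increases)
```

With this sample data the counts for b and c agree with the V1 column above.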
I explicitly take the 2nd through last (.N) values and subtract the 1st through (.N-1)th values from them, rather than calling diff. diff is an S3 generic, so every call spends time dispatching to the right method (here, diff.default), and with many groups it can be much faster to write the subtraction directly in j.
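The subtraction described above can be sketched in plain R to show that it matches diff, and why the single-row case needs its own guard. manual_diff is a hypothetical helper name used only for illustration; it is not part of data.table or base R.

```r
# Hypothetical helper: first differences without going through diff().
manual_diff <- function(value) {
  n <- length(value)
  # Guard: when n == 1, 2:n counts down (2, 1) and the indexing
  # would return garbage, so return an empty vector instead.
  if (n < 2L) return(value[0L])
  value[2:n] - value[1:(n - 1L)]
}

x <- c(1, 2, 3)                      # values for id "b" in the output above
same_as_diff <- identical(manual_diff(x), diff(x))
strictly_increasing <- all(manual_diff(x) > 0)

print(same_as_diff)
print(strictly_increasing)
```

The n < 2L guard plays the same role as the .N == 1L check in the data.table calls: without it, 2:.N would be the sequence 2, 1 for a single-row group.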