Question

I have a data.table object which I expand by id based on the date variables start.time and stop.time. The code listed below works - my question is on how to make my code easier to work with.

How to avoid manually listing the existing variables of the data.table object in the statement, i.e. "start.time" and "stop.time" in test <- DT[, list(start.time, stop.time, time=seq(start.time, stop.time, by="1 month")), by=id]. Manually adding the variables is annoying if you assume that I have not only two variables but several dozens. The aim is to include all columns that exist in the original data.table object.

I tried to add .SD and setdiff(names(DT), "id") instead of the variable names, but this threw errors.

DT <- data.table(id=sample(1000), 
                 start.time=sample(seq(as.Date("01-01-2010", format="%d-%m-%Y"), as.Date("31-12-2010", format="%d-%m-%Y"), by="day"), 1000, replace=TRUE),
                 stop.time=sample(seq(as.Date("01-01-2011", format="%d-%m-%Y"), as.Date("31-12-2011", format="%d-%m-%Y"), by="day"), 1000, replace=TRUE))

# > DT
#        id start.time  stop.time
#    1: 105 2010-12-16 2011-07-30
#    2: 837 2010-01-15 2011-01-19
#    3: 155 2010-11-30 2011-10-07
#    4: 125 2010-03-31 2011-04-19
#    5: 192 2010-07-13 2011-03-27
#   ---                          
#  996: 749 2010-07-19 2011-02-18
#  997: 660 2010-06-28 2011-10-25
#  998: 808 2010-03-22 2011-12-04
#  999: 291 2010-09-03 2011-06-08
# 1000: 741 2010-01-25 2011-01-23


test <- DT[, list(start.time, stop.time, time=seq(start.time, stop.time, by="1 month")), by=id]

# > test
#         id start.time  stop.time       time
#     1: 903 2010-11-08 2011-10-22 2010-11-08
#     2: 903 2010-11-08 2011-10-22 2010-12-08
#     3: 903 2010-11-08 2011-10-22 2011-01-08
#     4: 903 2010-11-08 2011-10-22 2011-02-08
#     5: 903 2010-11-08 2011-10-22 2011-03-08
#    ---                                     
# 12346: 229 2010-09-11 2011-10-02 2011-05-11
# 12347: 229 2010-09-11 2011-10-02 2011-06-11
# 12348: 229 2010-09-11 2011-10-02 2011-07-11
# 12349: 229 2010-09-11 2011-10-02 2011-08-11
# 12350: 229 2010-09-11 2011-10-02 2011-09-1
Was it helpful?

Solution

You can use .SD like this:

DT[, c(.SD, list(time=seq(start.time, stop.time, by="1 month"))), by=id]
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top