Insert values of row r into row (r+1) and insert 1 into the first row for multiple columns in a data.table


  •  16-07-2023


Given a data.table and a vector naming several target columns, what is the most efficient way to replace each target column's value in row 1 with 1, and its value in row r with the value from row (r-1) plus 1?

The whole operation should be applied separately within each group defined by the key id1.

The original data.table and the target columns look like this:


library(data.table)
DT <- data.table(id1=c(1,1,1,2,2,2), id2=c(1,2,3,1,2,3), c1=c(0,1,0,2,1,2), c2=c(0,0,1,1,2,3), c3=c(1,2,2,1,1,1))
cnames <- c("c1","c2","c3")

#    id1 id2 c1 c2 c3
# 1:   1   1  0  0  1
# 2:   1   2  1  0  2
# 3:   1   3  0  1  2
# 4:   2   1  2  1  1
# 5:   2   2  1  2  1
# 6:   2   3  2  3  1

This is the desired result:

#    id1 id2 c1 c2 c3
# 1:   1   1  1  1  1      # substituted by 1
# 2:   1   2  1  1  2      # previous row + 1
# 3:   1   3  2  1  3      #        "
# 4:   2   1  1  1  1      # substituted by 1
# 5:   2   2  3  2  2      # previous row + 1
# 6:   2   3  2  3  2      #        "

I know that something like DT[,"c1" := c(1,c1[.I-1]+1), by=id1] might work, but this poses two challenges. First, the first value of c1[.I-1] is not defined. Second, this code performs the substitution for one column only (here: "c1"), whereas I need it performed for many columns, named in the vector "cnames".

Thanks! Jana



The easiest way is to first set the first row of each group to all 0's and then add 1 to every target column. This is equivalent to what you're asking for. Here's how I'd do it:

setkey(DT, id1)
DT[J(unique(id1)), c(cnames) := list(0L), mult="first"]
DT[, c(cnames) := .SD+1L, .SDcols=cnames]

#    id1 id2 c1 c2 c3
# 1:   1   1  1  1  1
# 2:   1   2  2  1  3
# 3:   1   3  1  2  3
# 4:   2   1  1  1  1
# 5:   2   2  2  3  2
# 6:   2   3  3  4  2

Following OP's comment and edit to the question:

You can accomplish this as follows: first shift each target column down by 1 row within each group, filling the first row with 0's, and then add 1 to all the target columns.

DT[, c(cnames) := lapply(.SD, function(x) 
            c(0L, head(x, -1L))), by=id1, .SDcols=cnames]
DT[, c(cnames) := .SD+1L, .SDcols=cnames]

> DT
#    id1 id2 c1 c2 c3
# 1:   1   1  1  1  1
# 2:   1   2  1  1  2
# 3:   1   3  2  1  3
# 4:   2   1  1  1  1
# 5:   2   2  3  2  2
# 6:   2   3  2  3  2
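
For reference, the two steps above can probably be collapsed into a single grouped call with data.table's shift() function (available in data.table 1.9.6 and later). This is only a sketch of an alternative, not the method used above; it rebuilds the example table from the question so that it runs on its own.

```r
library(data.table)

# Rebuild the example data from the question
DT <- data.table(id1 = c(1,1,1,2,2,2), id2 = c(1,2,3,1,2,3),
                 c1 = c(0,1,0,2,1,2), c2 = c(0,0,1,1,2,3), c3 = c(1,2,2,1,1,1))
cnames <- c("c1", "c2", "c3")

# Within each id1 group: lag every target column by one row,
# fill the vacated first row with 0, then add 1
DT[, (cnames) := lapply(.SD, function(x) shift(x, n = 1L, fill = 0) + 1),
   by = id1, .SDcols = cnames]
```

Because the lag is computed per group, the first row of each group always receives the fill value 0 and therefore ends up as 1.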

Another variation, based on the question in your comment:

First shift all the data down by 1 row, ignoring the groups, and add 1 to it. Then set the first row of each group to all 1's. Note that this has to be run on the original DT, not on the result above.

setkey(DT, id1)
DT[2:nrow(DT), c(cnames) := head(DT[, ..cnames], -1L) + 1L]
DT[J(unique(id1)), c(cnames) := list(1L), mult="first"]
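
If you would rather not key the table, the same idea can be sketched with plain row numbers instead of a keyed join. The helper names first_rows and shifted are made up for this illustration, and the sketch assumes the rows of each id1 group are already contiguous and in the intended order.

```r
library(data.table)

# Rebuild the example data from the question
DT <- data.table(id1 = c(1,1,1,2,2,2), id2 = c(1,2,3,1,2,3),
                 c1 = c(0,1,0,2,1,2), c2 = c(0,0,1,1,2,3), c3 = c(1,2,2,1,1,1))
cnames <- c("c1", "c2", "c3")

# Row numbers of the first row in each id1 group (no key needed)
first_rows <- DT[, .I[1L], by = id1]$V1

# All rows except the last, plus 1; computed up front so the
# assignment below cannot see partially updated values
shifted <- head(DT[, ..cnames], -1L) + 1

# Write the shifted values into rows 2..n, ignoring groups
DT[2:nrow(DT), (cnames) := shifted]

# Overwrite the first row of each group with 1
DT[first_rows, (cnames) := 1]
```

On the example data this should reproduce the desired table from the question.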

Licensed under: CC-BY-SA with attribution