Question

I'm trying to do something similar to, but different enough from, what's described here: Update subset of data.table based on join

Specifically, I'd like to assign column values from table control to the rows with matching key values (person_id is a key in both tables). ci is the column index. The statement below complains that 'with=F' was not used. When I delete those parts, it still doesn't work as expected. Any suggestions?

To rephrase: I'd like to set the subset of flatData that corresponds to control FROM control.

flatData[J(eval(control$person_id)), ci, with=F] = control[, ci, with=F]

To give a reproducible example using classic R:

x = data.frame(a = 1:3, b = 1:3, key = c('a', 'b', 'c'))
y = data.frame(a = c(2, 5), b = c(11, 2), key = c('a', 'b'))

colidx = match(c('a', 'b'), colnames(y))

x[x$key %in% y$key, colidx] = y[, colidx]

As an aside, someone please explain how to easily assign SETS of columns without using indices! Indices and data.table are a marriage made in hell.


Solution

You can combine the := operator with a join in a single step, as follows:

First prepare data:

require(data.table) ## >= 1.9.0
setDT(x)            ## converts DF to DT by reference
setDT(y)
setkey(x, key)      ## set key column
setkey(y, key)

Now the one-liner:

x[y, c("a", "b") := list(i.a, i.b)]

:= modifies by reference (in-place). The rows to modify are provided by the indices computed from the join in i.
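To make the in-place behaviour concrete, here is a minimal end-to-end sketch built on the question's example. One assumption on my part: y is created with integer literals (2L, 5L, ...) so its column types already match x, and stringsAsFactors = FALSE is added so the key column is character on older R versions too.

```r
library(data.table)

# Same shape as the question's data; y uses integer literals (assumption)
# so that its columns have the same type as x's.
x <- data.frame(a = 1:3, b = 1:3, key = c("a", "b", "c"),
                stringsAsFactors = FALSE)
y <- data.frame(a = c(2L, 5L), b = c(11L, 2L), key = c("a", "b"),
                stringsAsFactors = FALSE)

setDT(x)
setDT(y)
setkey(x, key)
setkey(y, key)

# No `x <- ...` is needed: the join picks the rows of x whose key is in y,
# and := overwrites a and b on those rows only, in place.
x[y, c("a", "b") := list(i.a, i.b)]

print(x)
```

After this runs, x$a is 2, 5, 3 and x$b is 11, 2, 3: the rows with key "a" and "b" took y's values, and the row with key "c" is untouched.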

In a join of the form x[i], the i. prefix lets you refer to i's columns from within j. Here both x and y have columns named a and b, so plain a and b would refer to x's columns; i.a and i.b refer to y's.
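Regarding the aside about assigning sets of columns without indices: the i. prefix can also be built from a character vector of names. The sketch below is one common idiom, not the only way to do it. The variable name cols is made up for illustration, and y again uses integer literals so the types match.

```r
library(data.table)

x <- data.frame(a = 1:3, b = 1:3, key = c("a", "b", "c"),
                stringsAsFactors = FALSE)
y <- data.frame(a = c(2L, 5L), b = c(11L, 2L), key = c("a", "b"),
                stringsAsFactors = FALSE)
setDT(x)
setDT(y)
setkey(x, key)
setkey(y, key)

# Hypothetical variable holding the names of the columns to copy over
cols <- c("a", "b")

# (cols) on the left-hand side means "the columns named in cols".
# mget() looks up "i.a" and "i.b" by name and returns them as a list,
# which is the same shape as list(i.a, i.b).
x[y, (cols) := mget(paste0("i.", cols))]

print(x)
```

The parentheses around cols matter: without them, data.table would create or overwrite a column literally named cols.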

HTH

PS: In your example, y's columns a and b are of type numeric while x's are of type integer, so when you run this on your data you may get a warning that the types didn't match and a coercion had to take place.
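If you'd rather make the types match before joining, one option is to convert y's columns up front. This is a sketch under the assumption that the values really are whole numbers, since as.integer() drops any fractional part. The name cols is again just for illustration.

```r
library(data.table)

# y as in the question: a and b are stored as double (numeric)
y <- data.frame(a = c(2, 5), b = c(11, 2), key = c("a", "b"),
                stringsAsFactors = FALSE)
setDT(y)

# Hypothetical variable holding the names of the columns to convert
cols <- c("a", "b")

# Replace each listed column with its integer version, by reference.
# Assigning whole columns (no i) is allowed to change the column type.
y[, (cols) := lapply(.SD, as.integer), .SDcols = cols]

print(sapply(y, class))
```

After this, y's a and b are integer like x's, so the join update has nothing to coerce.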

Licensed under: CC-BY-SA with attribution