Question

I've the following situation. A data.table that looks as follows

x = data.table(
 id1 = c('a','b','c'),
 id2 = c('x','y','x'),
 val = c(0.2,0.3,0.5))

I have two other data tables that give a mapping between the values in id1 and id2 which look like the following

id1.dt = data.table(
 id1 = c('a','a','a','b','b','c'),
 fid = c('1232','3224','3434','234','231','332')
)

and

id2.dt = data.table(
 id2 = c('x','x','y','y'),
 fid = c('334','443','344','24')
)

What I would like to be able to do is to expand out the above data.table x by preserving the values column such that I get a full cross join but by using the fid column. So the expected final table is

id1 id2 val
1232 334 0.2
1232 443 0.2
3224 334 0.2
3224 443 0.2
3434 334 0.2
3434 443 0.2
...

Basically, for each row in x I want to take all fid values of id1 and id2 from the other two tables and preserve the val value. I've tried using CJ but didn't get far with it. Any help appreciated.

Was it helpful?

Solution

A bit awkward but this should do it:

setkey(x, id1)
(setkey(x[id1.dt], id2))[
  id2.dt, allow.cartesian=T][
  order(val), -(1:2)
]

Produces:

    val  fid fid.1
 1: 0.2 1232   334
 2: 0.2 3224   334
 3: 0.2 3434   334
 4: 0.2 1232   443
 5: 0.2 3224   443
 6: 0.2 3434   443
 7: 0.3  234   344
 8: 0.3  231   344
 9: 0.3  234    24
10: 0.3  231    24
11: 0.5  332   334
12: 0.5  332   443

You can also try merge.data.table to achieve a similar result in a somewhat more intuitively graspable form:

merge(
  merge(x, id1.dt, by="id1"), 
  id2.dt, by="id2", allow.cartesian=T
)[, -(1:2)]
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top