Question

I've searched for an answer to this simple question, but can't find a similar question. I have 3 data tables:

set.seed(0)
demo <- data.table(id = 1:10, demo.var = rnorm(10), key = 'id'); demo
lab <- data.table(id = 1:7, tc = rnorm(7), key = 'id'); lab
anthro <- data.table(id = 4:9, bmi = rnorm(6), key = 'id'); anthro

All IDs that are in lab and anthro are in the demo data.table, but lab and anthro contain different subsets of the IDs in demo

Both

lab[demo]
anthro[demo]

give the information I want: all 10 IDs with additional information from either the lab or anthro data.table, but is there a was to merge all 3 together in a similar manner? I've tried some permutations such as

anthro[lab][demo]

but this gives the preserves the anthro information only for the IDs that are in the lab data.table - there's no anthro information for IDs 8 and 9

Thanks in advance for any help

Was it helpful?

Solution

anthro[lab[demo]]
#      id        bmi         tc     demo.var
#   1:  1         NA  0.7635935  1.262954285
#   2:  2         NA -0.7990092 -0.326233361
#   3:  3         NA -1.1476570  1.329799263
#   4:  4 -0.8919211 -0.2894616  1.272429321
#   5:  5  0.4356833 -0.2992151  0.414641434
#   6:  6 -1.2375384 -0.4115108 -1.539950042
#   7:  7 -0.2242679  0.2522234 -0.928567035
#   8:  8  0.3773956         NA -0.294720447
#   9:  9  0.1333364         NA -0.005767173
#  10: 10         NA         NA  2.404653389

The inner table is always the one the outer join is performed on, so this nesting makes sure that the table with the super set of the index values is always the inner table.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top