Is there a way in R to combine two tables according to matching values in a certain column? [duplicate]

StackOverflow https://stackoverflow.com/questions/22998527

  •  01-07-2023

Question

Say I have two dataframes, df1 and df2:

chrom   pos   genSym   type
1       4     blah     DEL
2       5     guh      INS   
1       6     poo      DEL
2       7     foo      MMP

chrom   pos   genSym   type
1       4     blah     DEL
3       3     grub     INS   
1       6     poo      INS
2       7     foo      MMP

And I'd like to combine them in such a way that the rows containing the same chrom, pos, and genSym values are paired on the same row (with duplications as needed). Rows containing chrom, pos, and genSym values not found in the other dataframe are listed unpaired, if that makes any sense. Output would ideally look kind of like this:

chrom   pos   genSym   type    chrom   pos   genSym   type 
1       4     blah     DEL     1       4     blah     DEL
2       5     guh      INS     
1       6     poo      DEL     1       6     poo      INS
2       7     foo      MMP     2       7     foo      MMP
                               3       3     grub     INS

Is there a package in R that streamlines this? If R doesn't readily do this, does anyone have suggestions for another tool?


Solution

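To get what you want, use merge() with all = TRUE, which does a full outer join on the listed columns:

merge(df1, df2, by = c("chrom", "pos", "genSym"), all = TRUE)

Note that the key columns appear only once in the result, and the two type columns come back as type.x and type.y, with NA where a row has no partner in the other data frame. Spell out TRUE rather than T, since T is an ordinary variable that can be reassigned.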
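For anyone who wants to try this on the data from the question, here is a minimal sketch. The data frames are retyped from the tables above; the suffixes ".df1" and ".df2" are just my choice for illustration (the default is ".x" and ".y").

```r
# Rebuild the two example data frames from the question
df1 <- data.frame(
  chrom  = c(1, 2, 1, 2),
  pos    = c(4, 5, 6, 7),
  genSym = c("blah", "guh", "poo", "foo"),
  type   = c("DEL", "INS", "DEL", "MMP"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  chrom  = c(1, 3, 1, 2),
  pos    = c(4, 3, 6, 7),
  genSym = c("blah", "grub", "poo", "foo"),
  type   = c("DEL", "INS", "INS", "MMP"),
  stringsAsFactors = FALSE
)

# Full outer join on the three key columns; unmatched rows get NA
# in the type column that comes from the other data frame
merged <- merge(df1, df2,
                by = c("chrom", "pos", "genSym"),
                all = TRUE,
                suffixes = c(".df1", ".df2"))
print(merged)
```

The result has one row per distinct chrom/pos/genSym combination: the guh row has NA in type.df2 and the grub row has NA in type.df1.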

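As a learning exercise, I would also try the general form, where df and df.other are any two data frames and x is a character vector naming the columns to match on:

merge(df, df.other, by = x, all = TRUE)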

A good reference for understanding outer, left, and right joins is https://stackoverflow.com/a/1300618/2747709. Look at the other answers there for SQL-style matching.
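As a rough sketch of how those join types map onto merge(), the all, all.x and all.y arguments select which unmatched rows are kept. The data frames are the ones from the question; the variable names (keys, inner, left, right, full) are only illustrative.

```r
# Example data retyped from the question
df1 <- data.frame(
  chrom  = c(1, 2, 1, 2),
  pos    = c(4, 5, 6, 7),
  genSym = c("blah", "guh", "poo", "foo"),
  type   = c("DEL", "INS", "DEL", "MMP"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  chrom  = c(1, 3, 1, 2),
  pos    = c(4, 3, 6, 7),
  genSym = c("blah", "grub", "poo", "foo"),
  type   = c("DEL", "INS", "INS", "MMP"),
  stringsAsFactors = FALSE
)

keys <- c("chrom", "pos", "genSym")

inner <- merge(df1, df2, by = keys)                # only rows present in both
left  <- merge(df1, df2, by = keys, all.x = TRUE)  # every row of df1
right <- merge(df1, df2, by = keys, all.y = TRUE)  # every row of df2
full  <- merge(df1, df2, by = keys, all = TRUE)    # every row of either

# Row counts for each join type
print(sapply(list(inner = inner, left = left, right = right, full = full), nrow))
```

With this data the inner join keeps the three shared rows (blah, poo, foo), the left and right joins each add the one row unique to their side, and the full join keeps all five.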

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow