Question

I have a large data frame with financial data that looks like this:

id      Tradedate    name          hour open    close
19897   2013-01-30   instrument1   1   18.01   13.50
19898   2013-01-30   instrument2   2   15.72    8.99
19899   2013-01-30   instrument3   3   12.80   11.42
19900   2013-01-30   instrument4   4   12.71   12.85

There are a couple thousand instruments in the above data frame. I have another "to be traded" data frame of a dozen or so instruments that looks like this:

id    name   hour 
1 instrument3 17     
2 instrument4 24    
3 instrument5 15    
4 instrument6 19

The issue I'm running into is that I cannot get subset to return a data frame that contains only the instrument-hour combinations in the "to be traded" data frame. I have tried this: subset(financial_data, subset= paste(name, hour) %in% paste(to_be_traded$name, portfolio$hour)), but it returns the full financial data frame. I know that this could be accomplished in SQL with something like an INNER JOIN, but I do not know how to do this in R. Any and all help is greatly appreciated.

Solution

In R, an INNER JOIN is done with merge. In your sample data, no name and hour combination appears in both data frames, but if any did, the matching rows would be returned by:

merge(dat1, dat2, by=c("name", "hour"))
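As a minimal sketch of that call, the rows below are made up for illustration (the hour values are changed from the question's sample so that two name and hour pairs actually overlap). The names financial_data and to_be_traded are taken from the question.

```r
# Toy data, invented for illustration: two (name, hour) pairs occur in both frames.
financial_data <- data.frame(
  id        = 19897:19900,
  Tradedate = rep(as.Date("2013-01-30"), 4),
  name      = c("instrument1", "instrument2", "instrument3", "instrument4"),
  hour      = c(1, 2, 17, 24),
  open      = c(18.01, 15.72, 12.80, 12.71),
  close     = c(13.50, 8.99, 11.42, 12.85)
)
to_be_traded <- data.frame(
  id   = 1:4,
  name = c("instrument3", "instrument4", "instrument5", "instrument6"),
  hour = c(17, 24, 15, 19)
)

# Inner join: keep only rows whose (name, hour) pair appears in both frames.
# Both frames have an `id` column that is not a join key, so merge()
# keeps both and renames them id.x and id.y.
matched <- merge(financial_data, to_be_traded, by = c("name", "hour"))
print(matched)
```

Only the instrument3 and instrument4 rows survive, since those are the only pairs present on both sides.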

If you want OUTER JOINs (left or right), specify all.x = TRUE or all.y = TRUE in the merge call; all = TRUE gives a full outer join.
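The outer variants can be sketched the same way. The data here is again invented, and the trade_id column is a hypothetical name used only to make the unmatched rows visible as NA.

```r
# Toy data, invented for illustration.
financial_data <- data.frame(
  name  = c("instrument1", "instrument3", "instrument4"),
  hour  = c(1, 17, 24),
  close = c(13.50, 11.42, 12.85)
)
to_be_traded <- data.frame(
  name     = c("instrument3", "instrument4", "instrument5"),
  hour     = c(17, 24, 15),
  trade_id = 1:3  # hypothetical column, not in the question
)

keys <- c("name", "hour")

# Left outer join: every row of financial_data, NA where there is no trade.
left  <- merge(financial_data, to_be_traded, by = keys, all.x = TRUE)

# Right outer join: every row of to_be_traded, NA where there is no price.
right <- merge(financial_data, to_be_traded, by = keys, all.y = TRUE)

# Full outer join: every row from both sides.
full  <- merge(financial_data, to_be_traded, by = keys, all = TRUE)
```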

OTHER TIPS

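A simple merge(financial_data, trade_data) gets the job done when name and hour are the only column names the two data frames share. With no by argument, merge joins on every shared column name, which in the sample data above would also include id.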
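A short sketch of how the default join columns are chosen, using made-up rows and assuming both frames carry an unrelated id column as in the question's samples:

```r
# Toy data, invented for illustration; the id columns are unrelated row ids.
financial_data <- data.frame(
  id    = 19899:19900,
  name  = c("instrument3", "instrument4"),
  hour  = c(17, 24),
  close = c(11.42, 12.85)
)
to_be_traded <- data.frame(
  id   = 1:2,
  name = c("instrument3", "instrument4"),
  hour = c(17, 24)
)

# With no `by`, merge() joins on all column names common to both frames.
shared <- intersect(names(financial_data), names(to_be_traded))

# Here that includes id, and the ids never match, so nothing is returned.
default_join <- merge(financial_data, to_be_traded)

# Dropping id from one side leaves name and hour as the only shared columns.
key_join <- merge(financial_data, to_be_traded[, c("name", "hour")])
```

Passing by = c("name", "hour") explicitly avoids depending on which columns happen to share a name.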

Licensed under: CC-BY-SA with attribution