Question

I have two data sets, AAPL and AMZN, that I wish to merge, but I am finding it difficult because merge and cbind do not produce the result I want. I believe the issue is that the data sets are not recognized as data.frames, but I am not sure.

The data looks like this:

      Date Time   Open   High    Low  Close  Volume
1 12/14/12 9:30 514.75 515.10 512.72 512.86 2504264
2 12/14/12 9:31 512.80 513.00 510.00 510.17  574498
3 12/14/12 9:32 510.04 511.70 509.11 511.26  673126
4 12/14/12 9:33 511.26 511.54 508.82 509.25  477914
5 12/14/12 9:34 509.03 510.65 508.50 510.54  432689

Desired Outcome:

    Date Time   Open   High    Low  Close Volume
12/14/12 9:30 250.11 250.64 250.07 250.37  38249
12/14/12 9:31 250.60 250.60 250.16 250.51   6954
12/14/12 9:32 250.47 250.72 250.43 250.72   3843
12/14/12 9:33 250.69 250.70 250.44 250.50   3990
12/14/12 9:34 250.46 250.64 250.21 250.31   4490

    Date Time   Open   High    Low  Close Volume
12/14/12 9:31 512.80 513.00 510.00 510.17 574498
12/14/12 9:32 510.04 511.70 509.11 511.26 673126
12/14/12 9:33 511.26 511.54 508.82 509.25 477914
12/14/12 9:34 509.03 510.65 508.50 510.54 432689

Essentially, I want to merge the two data sets by Date and Time, side by side (I could not show that layout here). I have tried converting each data set to xts, but I am not sure if it is correct:

AAPL <- read.csv("aapl1.csv",header=TRUE)
AMZN <- read.csv("amzn1.csv",header=TRUE)
aapl <- xts(AAPL[,c(3:7)], AAPL$DATETIME <-as.POSIXct(paste(AAPL$Date,AAPL$Time), format=""%m/%d/%Y %H:%M"))
amzn <- xts(AMZN[,c(3:7)], AMZN$DATETIME <-as.POSIXct(paste(AMZN$Date,AMZN$Time), format=""%m/%d/%Y %H:%M"))

It then fails to merge when I use cbind, merge, or even join.


Solution 2

A second alternative is join() from the plyr package. It has some advantages over merge() but provides fewer options. It can be worth considering for very large data sets because it is often faster than merge().

require(plyr)
join(AAPL, AMZN, by = c("Date", "Time"))
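Two things to watch for with join(): it defaults to a left join (type = "left"), and, as far as I know, it has no equivalent of merge()'s suffixes argument. A minimal sketch, assuming a trimmed copy of the sample data from this page (three rows, two price columns) and a hypothetical helper add_suffix that renames the value columns by hand:

```r
library(plyr)

# Trimmed copies of the sample data, only to keep the example short.
AAPL <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "Date Time Open Close
12/14/12 9:30 250.11 250.37
12/14/12 9:31 250.60 250.51
12/14/12 9:32 250.47 250.72")

AMZN <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "Date Time Open Close
12/14/12 9:31 512.80 510.17
12/14/12 9:32 510.04 511.26")

keys <- c("Date", "Time")

# Hypothetical helper: tag every non-key column with a ticker suffix
# so the column names stay unique after the join.
add_suffix <- function(df, suffix) {
  is_value <- !(names(df) %in% keys)
  names(df)[is_value] <- paste0(names(df)[is_value], suffix)
  df
}

aapl_named <- add_suffix(AAPL, ".AAPL")
amzn_named <- add_suffix(AMZN, ".AMZN")

# type = "full" keeps rows that appear in either data set.
both <- join(aapl_named, amzn_named, by = keys, type = "full")
print(both)
```

The 9:30 row has no AMZN match, so its AMZN columns should come back as NA.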

OTHER TIPS

If your xts objects are indexed by the datetime (as they should be), simply pass the two sets to merge. Here, I'll merge a set with itself, as your question lacks example data:

library(xts)
data(sample_matrix)
sample.xts <- as.xts(head(sample_matrix), descr='my new xts object') # From ?xts

merge(sample.xts, sample.xts)
##                Open     High      Low    Close   Open.1   High.1    Low.1  Close.1
## 2007-01-02 50.03978 50.11778 49.95041 50.11778 50.03978 50.11778 49.95041 50.11778
## 2007-01-03 50.23050 50.42188 50.23050 50.39767 50.23050 50.42188 50.23050 50.39767
## 2007-01-04 50.42096 50.42096 50.26414 50.33236 50.42096 50.42096 50.26414 50.33236
## 2007-01-05 50.37347 50.37347 50.22103 50.33459 50.37347 50.37347 50.22103 50.33459
## 2007-01-06 50.24433 50.24433 50.11121 50.18112 50.24433 50.24433 50.11121 50.18112
## 2007-01-07 50.13211 50.21561 49.99185 49.99185 50.13211 50.21561 49.99185 49.99185

This works because merge calls merge.xts for these data.
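merge.xts also accepts a join argument, so the same call can return only the timestamps present in both series. A small sketch using the Open prices from the question; the UTC time zone and the column names AAPL.Open and AMZN.Open are assumptions made for illustration:

```r
library(xts)

# Five one-minute timestamps starting at 09:30; UTC is assumed
# only to keep the example reproducible.
idx <- as.POSIXct("2012-12-14 09:30", tz = "UTC") + 60 * 0:4

aapl <- xts(cbind(AAPL.Open = c(250.11, 250.60, 250.47, 250.69, 250.46)),
            order.by = idx)
# The AMZN sample has no 09:30 bar, so drop the first timestamp.
amzn <- xts(cbind(AMZN.Open = c(512.80, 510.04, 511.26, 509.03)),
            order.by = idx[-1])

outer <- merge(aapl, amzn)                  # default: every timestamp, NA where missing
inner <- merge(aapl, amzn, join = "inner")  # only timestamps found in both

print(outer)
print(inner)
```

Giving the columns distinct names up front avoids the Open / Open.1 style of naming seen in the output above.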

Here's a merge of your sample data, without using xts. First, let's read them into the interpreter:

AAPL <- read.table(header=TRUE, text='Date Time Open High Low Close Volume
12/14/12 9:30 250.11 250.64 250.07 250.37 38249
12/14/12 9:31 250.60 250.60 250.16 250.51 6954
12/14/12 9:32 250.47 250.72 250.43 250.72 3843
12/14/12 9:33 250.69 250.70 250.44 250.50 3990
12/14/12 9:34 250.46 250.64 250.21 250.31 4490')

AMZN <- read.table(header=TRUE, text='Date Time Open High Low Close Volume
12/14/12 9:31 512.80 513.00 510.00 510.17 574498
12/14/12 9:32 510.04 511.70 509.11 511.26 673126
12/14/12 9:33 511.26 511.54 508.82 509.25 477914
12/14/12 9:34 509.03 510.65 508.50 510.54 432689')

These are now objects of class data.frame and can be merged on the Date and Time columns:

merge(AAPL, AMZN, by=c('Date', 'Time'), all=TRUE, suffixes = c('.AAPL', '.AMZN'))
##       Date Time Open.AAPL High.AAPL Low.AAPL Close.AAPL Volume.AAPL Open.AMZN High.AMZN Low.AMZN Close.AMZN Volume.AMZN
## 1 12/14/12 9:30    250.11    250.64   250.07     250.37       38249        NA        NA       NA         NA          NA
## 2 12/14/12 9:31    250.60    250.60   250.16     250.51        6954    512.80    513.00   510.00     510.17      574498
## 3 12/14/12 9:32    250.47    250.72   250.43     250.72        3843    510.04    511.70   509.11     511.26      673126
## 4 12/14/12 9:33    250.69    250.70   250.44     250.50        3990    511.26    511.54   508.82     509.25      477914
## 5 12/14/12 9:34    250.46    250.64   250.21     250.31        4490    509.03    510.65   508.50     510.54      432689
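The all argument decides which rows survive the merge. A short sketch comparing the default (matching rows only) with all = TRUE, assuming a trimmed copy of the sample data (three rows, two price columns) to keep it compact:

```r
# Trimmed copies of the sample data, only to keep the example short.
AAPL <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "Date Time Open Close
12/14/12 9:30 250.11 250.37
12/14/12 9:31 250.60 250.51
12/14/12 9:32 250.47 250.72")

AMZN <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "Date Time Open Close
12/14/12 9:31 512.80 510.17
12/14/12 9:32 510.04 511.26")

keys <- c("Date", "Time")
sfx  <- c(".AAPL", ".AMZN")

# Default (all = FALSE): keep only Date/Time pairs present in both frames.
inner <- merge(AAPL, AMZN, by = keys, suffixes = sfx)

# all = TRUE: keep every Date/Time pair, filling the gaps with NA.
outer <- merge(AAPL, AMZN, by = keys, all = TRUE, suffixes = sfx)

print(inner)
print(outer)
```

Because Date and Time are stored as text here, any sorting on them is alphabetical ("10:00" sorts before "9:30"), which is one reason to build a proper date-time column for longer series.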

Converting to xts and using merge would work, once you fix a few issues in your code.

library(xts)

AAPL <- read.csv("aapl1.csv", header=TRUE)
AMZN <- read.csv("amzn1.csv", header=TRUE)
# Your code is easier to understand if you create these columns outside of the
# xts constructor. Note that your `format` was incorrect: you need %y
# (2-digit year), not %Y (4-digit year). You also had unmatched quotes.
AAPL$DATETIME <- as.POSIXct(paste(AAPL$Date, AAPL$Time), format="%m/%d/%y %H:%M")
AMZN$DATETIME <- as.POSIXct(paste(AMZN$Date, AMZN$Time), format="%m/%d/%y %H:%M")
# Create the xts objects (columns 3:7 are Open, High, Low, Close, Volume) and merge
aapl <- xts(AAPL[, 3:7], AAPL$DATETIME)
amzn <- xts(AMZN[, 3:7], AMZN$DATETIME)
aapl.amzn <- merge(aapl, amzn)
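
The %y versus %Y difference is easy to check on a single timestamp before converting a whole column. A minimal check, assuming UTC only to make the result reproducible:

```r
stamp <- "12/14/12 9:30"

# %y reads "12" as a two-digit year, i.e. 2012.
right <- as.POSIXct(stamp, format = "%m/%d/%y %H:%M", tz = "UTC")

# %Y expects the full year, so "12" is not interpreted as 2012.
wrong <- as.POSIXct(stamp, format = "%m/%d/%Y %H:%M", tz = "UTC")

print(format(right, "%Y-%m-%d %H:%M"))
print(format(wrong, "%Y"))
```

If the parsed year is not 2012, or the result is NA, the format string does not match the data.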
Licensed under: CC-BY-SA with attribution