Question

Apologies in advance if this is answered elsewhere. I have searched for roughly 24 hrs and have come up empty at every turn.

This is the data set and code I am working with:

Sys.setenv(TZ='GMT')
dat = read.csv("SPY_MINUTE_TRADE.csv", header = TRUE) #QuantQuote sample minute data
dat[,2] <- sprintf('%04d', dat[,2]) #left-pad the time to four digits, e.g. 400 becomes 0400 (4 AM)

#Create a zoo object ordered by day and time from the dat dataframe
datzoo <- read.zoo(file=dat, sep=",", header=TRUE, 
                 index.column=1:2, format="%Y%m%d %H%M", tz="", 
                 colClasses = rep(c("character", "numeric"), c(2, 8)))

Spy <- as.xts(datzoo)

# Create regular series from 00:00 to 23:59 of 1 minute prints
y <-  xts(seq(from = 1, to = 60*24, by = 1), as.POSIXlt((0), 
    origin="2013-03-30 00:00", tz='GMT')+seq(from = 0, to = 60*60*24-1, by = 60))
colnames(y) <- "TempIndex"

#Merge the regular ts (y) with Spy and remove the helper TempIndex column
SpyReg <- merge(y,Spy, join='left')
SpyReg$TempIndex <- NULL

#Capture the index of Spy
ISpy <- index(Spy)

I have a few questions about the above code...

1) SpyReg["2012-03-30 04:00:00 GMT"] returns

 OPEN HIGH LOW CLOSE VOLUME SPLITS EARNINGS DIVIDENDS

Spy["2012-03-30 04:00:00 GMT"] returns the correct values of Spy for the given index

                      OPEN   HIGH    LOW  CLOSE VOLUME SPLITS EARNINGS DIVIDENDS
2012-03-30 04:00:00 140.66 140.66 140.66 140.66   2160      1        0         0

However,

SpyReg["T04:00:00/T04:01:00"]
                    OPEN HIGH LOW CLOSE VOLUME SPLITS EARNINGS DIVIDENDS
2013-03-30 04:00:00   NA   NA  NA    NA     NA     NA       NA        NA
2013-03-30 04:01:00   NA   NA  NA    NA     NA     NA       NA        NA

Why is this, when both are xts objects with the same index type, month, and time? Shouldn't SpyReg["2012-03-30 04:00:00 GMT"] return:

                    OPEN HIGH LOW CLOSE VOLUME SPLITS EARNINGS DIVIDENDS
2013-03-30 04:00:00   NA   NA  NA    NA     NA     NA       NA        NA

2) Why did the merge not give SpyReg the Spy values for the same index (such as the 4 AM print)? I tried all four "join" options, but none worked.

3) I assume there is a MUCH more elegant way to solve this problem than what I am trying to do. After creating Spy, it was not regular, minute by minute. I wanted to create a regular xts object that had no gaps and flowed continuously minute by minute from midnight to 23:59, add the entries from Spy into it, then do a na.locf to replace the rest of the NAs with the original data.


Solution

Setting the index of an xts object to POSIXlt can cause some strange behavior. I'd recommend simply using POSIXct instead.

URL <- "http://quantquote.com/sample/SPY_MINUTE_TRADE.csv"
Spy <- read.zoo(URL, sep=",", header=TRUE, index.column=1:2, FUN=function(x) 
    as.POSIXct(sprintf("%8d %04d",x[,1],x[,2]), format="%Y%m%d %H%M", tz=""))
Spy <- as.xts(Spy)
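
The same applies to the hand-built minute grid from the question. Below is a minimal sketch, on made-up stand-in data rather than the QuantQuote file, of building that grid directly as POSIXct. It also illustrates something visible in the question's output: the grid there starts at 2013-03-30 while the Spy row shown is dated 2012-03-30, so a left join onto that grid has nothing to match. The helper make_minute_grid and the single-row spy_row object are hypothetical names introduced only for this illustration.

```r
library(xts)

# Hypothetical helper: one day of minute timestamps as POSIXct (not POSIXlt)
make_minute_grid <- function(day) {
  seq(as.POSIXct(paste(day, "00:00:00"), tz = "GMT"),
      by = "1 min", length.out = 60 * 24)
}

# Stand-in for the single 04:00 print shown in the question
spy_row <- xts(140.66,
               order.by = as.POSIXct("2012-03-30 04:00:00", tz = "GMT"))
colnames(spy_row) <- "CLOSE"

# Grid on the same date as the data
y_same <- xts(seq_len(60 * 24), order.by = make_minute_grid("2012-03-30"))
colnames(y_same) <- "TempIndex"
matched <- merge(y_same, spy_row, join = "left")

# Grid on a different date, as in the question (2013 instead of 2012)
y_other <- xts(seq_len(60 * 24), order.by = make_minute_grid("2013-03-30"))
colnames(y_other) <- "TempIndex"
mismatched <- merge(y_other, spy_row, join = "left")

matched["2012-03-30 04:00:00"]      # carries the 04:00 value
mismatched["T04:00:00/T04:01:00"]   # second column is NA: no shared timestamps
```

If the grid and the data share a date, the left join picks up the 04:00 row; if they do not, every joined column stays NA regardless of the join type.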

Now you can merge Spy with an 'empty' xts object that has the regular index values you want.

SpyReg <- merge(Spy, xts(, seq(start(Spy),end(Spy),by="1 min")), fill=na.locf)
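
To see what this merge does without downloading the file, here is a small sketch on made-up irregular minute data (the values and the names prints, grid, regular and filled are assumptions for illustration, not part of the original data). It is written as two steps, merge and then na.locf, so the gap filling is explicit.

```r
library(xts)

# Made-up irregular minute prints with gaps at 04:01 and 04:02
idx <- as.POSIXct(c("2012-03-30 04:00", "2012-03-30 04:03",
                    "2012-03-30 04:04"), tz = "GMT")
prints <- xts(c(140.66, 140.70, 140.68), order.by = idx)
colnames(prints) <- "CLOSE"

# Regular one-minute index spanning the data
grid <- seq(start(prints), end(prints), by = "1 min")

# Merging with a zero-column xts adds the missing timestamps as NA rows
regular <- merge(prints, xts(order.by = grid))

# Carry the last observation forward into the gaps
filled <- na.locf(regular)
filled
```

Because the grid is derived from start() and end() of the data itself, it cannot end up on a different date than the observations.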
Licensed under: CC-BY-SA with attribution