Question

I just spent a day debugging some R code, only to find that the problem was caused by a missing date in the data returned by Yahoo through getSymbols. At the time of writing, Yahoo returns this:

           QQQ.Open QQQ.High QQQ.Low QQQ.Close QQQ.Volume QQQ.Adjusted
2014-01-03    87.27    87.35   86.62     86.64   35723700        86.64
2014-01-06    86.66    86.76   86.00     86.32   32073100        86.32
2014-01-07    86.72    87.25   86.56     87.12   25860600        87.12
2014-01-08    87.14    87.55   86.95     87.31   27197400        87.31
2014-01-09    87.63    87.64   86.72     87.02   23674700        87.02
2014-01-13    87.18    87.48   85.68     86.01   48842300        86.01
2014-01-14    86.30    87.72   86.30     87.65   37178900        87.65
2014-01-15    88.03    88.54   87.94     88.37   39835600        88.37
2014-01-16    88.30    88.51   88.16     88.38   31630100        88.38
2014-01-17    88.11    88.37   87.67     87.88   36895800        87.88

which is missing 2014-01-10. That date is returned for other ETFs. I expect Yahoo will fix the data one of these days (the data is on Google), but for now it is wrong, which gave my code some fits.

To address this issue, I want to check that my data has an entry for every date the markets were open. If there's a canned way to do this in some package I'd appreciate a pointer, but in the meantime I started writing some code using the timeDate package. However, I have ended up with xts index questions I don't understand. The code follows:

library(timeDate)
library(quantmod)

MyZone = "UTC"
Sys.setenv(TZ = MyZone)

YearStart = "1990"
YearEnd   = "2014"
currentYear = getRmetricsOptions("currentYear")

dateStart = paste0(YearStart, "-01-01")
dateEnd   = paste0(YearEnd, "-12-31")

DayCal = timeSequence(from = dateStart,  to = dateEnd, by="day", zone = MyZone)

TradingCal = DayCal[isBizday(DayCal, holidayNYSE())]

testSym = "QQQ"
getSymbols(testSym, src="yahoo", from = dateStart, to = dateEnd)

testData = get(testSym)

head(testData)
tail(testData, n=10)

#Save date range of data being checked
firstIndex = index(testData)[1]
lastIndex  = index(testData)[nrow(testData)]

#Create an xts series covering all dates
AllDates = xts(x=rep(1, length.out=length(TradingCal)), 
            order.by=TradingCal, tzone = MyZone)

head(AllDates)
tail(AllDates)

index(AllDates)[1:20]
index(testData)[1:20]

tzone(AllDates)
tzone(testData)

#Create an xts object that has all dates covered
#by testSym but using calendar I created
CheckData = subset(AllDates, ((index(AllDates)>=firstIndex) && 
                                (index(AllDates)<=lastIndex))
                  )

class(index(AllDates))
class(index(testData))

The goal here was to create a 'known good calendar' that I could use to build a simple xts object, and then check whether every index in that object has a corresponding index in the data being tested. However, I'm not getting that far, as it appears my indexes are not compatible. When I run the code I get this at the end:

> CheckData = subset(AllDates, ((index(AllDates)>=firstIndex) && (index(AllDates)<=lastIndex))
+ )
Error in `>=.default`(index(AllDates), firstIndex) : 
  comparison (5) is possible only for atomic and list types
> class(index(AllDates))
[1] "timeDate"
attr(,"package")
[1] "timeDate"
> class(index(testData))
[1] "Date"
> 

Can someone show me the error of my ways here so that I can move forward? Thanks!

Solution

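The comparison fails because index(testData) is of class Date while the index of AllDates is a timeDate object. You need to convert TradingCal to Date: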

TradingDates <- as.Date(TradingCal)
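
Once the classes match, two other details in the question's code may be worth checking. The sketch below illustrates them under stated assumptions and is not a tested fix for the original script. As far as I know, holidayNYSE() called with no argument returns holidays for the current year only, so the sketch passes the year explicitly. The range filter also uses the vectorized & rather than &&. The values of firstIndex and lastIndex are made-up stand-ins for the ones taken from testData, and InRange is a name introduced only for this example.

```r
library(timeDate)

# Calendar days for one month (a short range keeps the sketch quick)
DayCal <- timeSequence(from = "2014-01-01", to = "2014-01-31", by = "day")

# Pass the year(s) explicitly; holidayNYSE() defaults to the current year only
TradingCal <- DayCal[isBizday(DayCal, holidayNYSE(2014))]

# Convert the timeDate calendar to Date so it compares with a Date index
TradingDates <- as.Date(TradingCal)

# Hypothetical stand-ins for index(testData)[1] and the last index value
firstIndex <- as.Date("2014-01-03")
lastIndex  <- as.Date("2014-01-17")

# Use the vectorized & here; && is meant for single TRUE/FALSE values
InRange <- TradingDates[TradingDates >= firstIndex & TradingDates <= lastIndex]
print(InRange)
```

For the full 1990 to 2014 range in the question, the same idea would presumably be holidayNYSE(1990:2014).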

And here's another way to find index values in TradingDates that aren't in your testData index.

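# Zero-width xts object holding only the expected trading dates
AllDates <- xts(, TradingDates)
# ISO 8601 range string spanning the dates covered by testData
testSubset <- paste(start(testData), end(testData), sep="/")
# Outer-join on the index, then keep only the tested date range
CheckData <- merge(AllDates, testData)[testSubset]
# Rows where any column is NA are the dates missing from testData
BadDates <- CheckData[is.na(rowSums(CheckData))]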
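
To try the merge approach without downloading anything, here is a minimal sketch on made-up data. The five dates and the prices in the Close column are assumptions chosen for illustration, with 2014-01-10 deliberately left out. In real use TradingDates would come from as.Date(TradingCal) and testData from getSymbols.

```r
library(xts)

# Expected trading dates (assumed here; in practice as.Date(TradingCal))
TradingDates <- as.Date(c("2014-01-08", "2014-01-09", "2014-01-10",
                          "2014-01-13", "2014-01-14"))

# Made-up closing prices with 2014-01-10 deliberately left out
testData <- xts(c(87.31, 87.02, 86.01, 87.65),
                order.by = TradingDates[-3])
colnames(testData) <- "Close"

# Zero-width xts object that carries only the expected index
AllDates <- xts(, TradingDates)

# Outer join: dates absent from testData come back as NA rows
CheckData <- merge(AllDates, testData)

# Extract the dates on which any column is NA
BadDates <- index(CheckData)[is.na(rowSums(CheckData))]
print(BadDates)
```

This variant returns the missing dates themselves, via index(), rather than the NA rows, which may be more convenient if the result feeds a report or a stop() check.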
Licensed under: CC-BY-SA with attribution