Question

I would like to calculate the number of periods that have elapsed since the 200 period high of a univariate time series. For example, here's the closing price of SPY:

require(quantmod)
getSymbols("SPY",from='01-01-1900')
Data <- Cl(SPY)

Now, I can find the 200-period highs of this series using the Lag function in quantmod:

periodHigh <- function(x,n) {
    Lags <- Lag(x,1:n)
    High <- x == apply(Lags,1,max)
    x[High]
}
periodHigh(Data, 200)

But now I'm stuck. How do I merge this back onto the original series (Data) and calculate, for each point in the series, how many periods have elapsed since the previous n-period high?

Was it helpful?

Solution 3

I edited the code from the previous answers such that they are functions that take the same inputs (a univariate time series) and return the same output (a vector of days since the last n-day high):

daysSinceHigh1 <- function(x,n) {
    as.vector(n-rollapply(x, n, which.max))
}

daysSinceHigh2 <- function(x, n){
    apply(embed(x, n), 1, which.max)-1
}

The second function seems to be the fastest, but they're providing slightly different results:

> getSymbols("^GSPC",from='01-01-1900')
[1] "GSPC"
> system.time(x <- daysSinceHigh1(Cl(GSPC), 200))
   user  system elapsed 
   0.42    0.00    0.42 
> system.time(y <- daysSinceHigh2(Cl(GSPC), 200))
   user  system elapsed 
   0.24    0.00    0.24 
> all.equal(x,y)
[1] "Mean relative difference: 0.005025126"

Upon closer inspection, it appears that there are some weird edge cases in the 1st function:

data <- c(1,2,3,4,5,6,7,7,6,5,6,7,8,5,4,3,2,1)
answer <- c(0,0,0,0,1,2,3,0,0,1,2,3,4,4)
x <- daysSinceHigh1(data, 5)
y <- daysSinceHigh2(data, 5)

> x
 [1] 0 0 0 1 2 3 4 4 0 1 2 3 4 4
> y
 [1] 0 0 0 0 1 2 3 0 0 1 2 3 4 4
> answer
 [1] 0 0 0 0 1 2 3 0 0 1 2 3 4 4
> all.equal(x,answer)
[1] "Mean relative difference: 0.5714286"
> all.equal(y,answer)
[1] TRUE

Therefore, it seems like the second function (based off Andrie's code) is better.

OTHER TIPS

This little function returns a list with:

  • high the index number of high dates
  • recentHigh the index number of the most recent high day
  • daysSince the number of days since the last high
  • data an xts object with only the high days. Useful for plotting.

The code:

daysSinceHigh <- function(data, days){
  highs <- days-1+which(apply(embed(data, days), 1, which.max)==1)
  recentHigh <- max(highs)
  daysSince <- nrow(data) - recentHigh
  list(
    highs=highs,
    recentHigh = recentHigh,
    daysSince = daysSince,
    data=data[highs, ])
}       

The results:

daysSinceHigh(Data, 200)$daysSince
[1] 90

plot(Data)
points(daysSinceHigh(Data, 200)$data, col="red")

enter image description here

The answer to your revised question:

require(zoo)
x <- sample(300:500, 1000, replace=TRUE)
str(rollapply(x, 200, function(x) which.max(x)))
# int [1:801] 14 13 12 11 10 9 8 7 6 5 ...
 plot(x)
 plot(200:1000, rollapply(x, 200, function(x) 200-which.max(x)))

So for the xts series:

plot( rollapply(coredata(Data), 200, function(x) 200-which.max(x)))

enter image description here You obviously cannot merge anything back to the first 200 dates unless you apply a looser definition of rolling maximum. (In another SO session involving "shifty" data I showed how to use embed to pad the "trailing" periods: Data transformation in R but I don't know if you want to construct matrices that are 200 times as large as your input data.)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top