Question

This is the most challenging thing I have done in R so far in that both nls and LPPL are fairly new to me.

Below is a portion of script I have been working with. df is a data frame consisting of two columns, Date and Y, which are the closing prices for the S&P 500. I am not sure if it is relevant, but the dates start from 01-01-2003 through 12-31-2007.

f <- function(pars, xx) {pars$a + pars$b*(pars$tc - xx)^pars$m * 
                    (1 + pars$c * cos(pars$omega*log(pars$tc - xx) + pars$phi))} 
# residual function
resids <- function(p, observed, xx) {df$Y - f(p,xx)}
# fit using Levenberg-Marquardt algorithm
nls.out <- nls.lm(par=list(a=1,b=-1,tc=5000, m=0.5, omega=1, phi=1, c=1 ), fn = resids, 
              observed = df$Y, xx = df$days)
# use output of L-M algorithm as starting estimates in nls(...)
par <- nls.out$par

nls.final <- nls(Y~a+b*(tc-days)^m * (1 + c * cos(omega * log(tc-days) + phi)),data=df, 
             start=c(a=par$a, b=par$b, tc=par$tc, m=par$m, omega=par$omega, phi=par$phi,         c=par$c))
summary(nls.final) # display statistics of the fit 
# append fitted values to df
df$pred <- predict(nls.final)

When it runs, I receive the following message:

Error in nlsModel(formula, mf, start, wts) : 
  singular gradient matrix at initial parameter estimates
In addition: Warning messages:
1: In log(pars$tc - xx) : NaNs produced
2: In log(pars$tc - xx) : NaNs produced

The formula for LPPL can be found on the 5th screen of this pdf file, http://www.chronostraders.com/wp-content/uploads/2013/08/Research_on_LPPL.pdf

Do you know where I am going wrong? This was working correctly for a different model and I changed the code for the new equation. Credit to jlhoward for this code from this post, Using nls in R to re-create research.

Thank you for your help.

Per jlhoward's comment, df.rda can be downloaded here: https://drive.google.com/file/d/0B4xAKSwsHiEBb2lvQWR6T3NzUjA/edit?usp=sharing

Was it helpful?

Solution

First, a couple of minor things:

  1. Both nls(...) and nls.lm(...) require numeric arguments, not dates. So you have to convert somehow. I just added a days column that is the number of days since the start of your data.
  2. Your equation for F is different from Eqn. 1 in the reference, so I changed it to align.

*

f <- function(pars, xx) 
         with(pars,(a + (tc - xx)^m * (b + c * cos(omega*log(tc - xx) + phi))))

Now for the major issue: Your starting estimates are such that the LM regression fails to converge. As a result, the values in nls.out$par are not stable estimates. When you use these as the starting point for nls(...), that fails as well:

nls.out <- nls.lm(par=list(a=1,b=-1,tc=5000, m=0.5, omega=1, phi=1, c=1 ),
                  fn = resids, observed = df$Y, xx = df$days)
# Warning messages:
# 1: In log(pars$tc - xx) : NaNs produced
# 2: In log(pars$tc - xx) : NaNs produced
# ...
# 7: In nls.lm(par = list(a = 1, b = -1, tc = 5000, m = 0.5, omega = 1,  :
#   lmdif: info = -1. Number of iterations has reached `maxiter' == 50.

Generally, you should look to nls.out$status and nls.out$message to see what happened.

You have a complex model with 7 parameters. Unfortunately this leads to a situation where the regression has many local minima. Consequently, even if you provide estimates which lead to convergence, they might not be "useful". Consider:

nls.out <- nls.lm(par=list(a=1,b=1,tc=2000, m=-1, omega=1, phi=1, c=1 ), 
                  fn = resids, observed = df$Y, xx = df$days, 
                  control=nls.lm.control(maxiter=10000, ftol=1e-6, maxfev=1e6))
par <- nls.out$par
par
plot(df$Date,df$Y,type="l")
lines(df$Date,f(par,df$days))

This is a stable result (local minimum), but c is so small compared to b that the oscillations are invisible. On the other hand, these starting estimates produce a fit which matched the reference fairly closely:

nls.out <- nls.lm(par=list(a=0,b=1000,tc=2000, m=-1, omega=10, phi=1, c=200 ), 
                  fn = resids, observed = df$Y, xx = df$days, 
                  control=nls.lm.control(maxiter=10000, ftol=1e-6, maxfev=1e6))

This does produce parameter estimates which lead to convergence with nls(...), but the summary shows that the parameters are poorly estimated (only tc and omeega have p < 0.05).

nls.final <- nls(Y~a+(tc-days)^m * (b + c * cos(omega * log(tc-days) + phi)),
                 data=df, start=par, algorithm="plinear",
                 control=nls.control(maxiter=1000, minFactor=1e-8))
summary(nls.final)

Finally, using starting estimates very close the the reference (which admittedly is modeling the Great Depression, not the Great recession), gives a result which is even better:

nls.out <- nls.lm(par=list(a=600,b=-266,tc=3000, m=.5,omega=7.8,phi=-4,c=-14), 
                  fn = resids, observed = df$Y, xx = df$days, 
                  control=nls.lm.control(maxiter=10000, ftol=1e-6, maxfev=1e6))

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top