Question

I am looking to decompose daily sales data with a strong seasonal component (a 365-day seasonal period, which is too long for a seasonal ARIMA model). However, parts of the time series are explained by other factors, including regular marketing events. I would like to use R's stl function in a way similar to including exogenous variables in an ARIMA model, but I can't see anywhere to supply exogenous variables. Instead, I've regressed the "remainder" component on the exogenous variables in a separate step, but I worry that the seasonality picked up by stl is distorted by those regular marketing events.

Any suggestions on how to get around this issue?

Solution

STL is a bit limited as it only handles one type of seasonality at a time, and you probably have two seasonalities (weekly and annual). Also, it does not allow for exogenous variables.

One possible approach is a regression model with ARMA errors, with the seasonal period of the data set to 7 (for the weekly seasonality). You can handle the annual seasonality with Fourier terms (http://robjhyndman.com/hyndsight/longseasonality/) as regression variables. The marketing events can be handled with dummy variables, also passed in the xreg argument. You can even use auto.arima from the forecast package to select the order of the ARMA error process, including whether any weekly seasonal terms are needed. Just set up xreg with the Fourier terms and dummy variables, then call

auto.arima(y, xreg=xreg)
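As a rough sketch of how xreg might be set up, the example below uses simulated data. The series, the promo indicator, the event schedule and K = 3 are all assumptions made for illustration, not part of the original answer. It relies on fourier(), auto.arima() and forecast() from the forecast package.

```r
library(forecast)

set.seed(1)
n <- 3 * 365                      # three years of simulated daily data (assumption)
t <- seq_len(n)

# Hypothetical marketing-event indicator: 1 on event days, 0 otherwise
promo <- as.numeric(t %% 30 == 0)

# Simulated sales: annual cycle + weekly bump + promotion lift + noise
sales <- 100 + 10 * sin(2 * pi * t / 365.25) + 3 * (t %% 7 == 0) +
  15 * promo + rnorm(n)

# Seasonal period 7 so auto.arima can consider weekly seasonal ARMA terms
y <- ts(sales, frequency = 7)

# Annual seasonality as Fourier terms; K = 3 harmonics is an arbitrary choice
y_annual <- ts(sales, frequency = 365.25)
fourier_terms <- fourier(y_annual, K = 3)

# Regressor matrix: Fourier terms plus the event dummy
xreg <- cbind(fourier_terms, promo = promo)

fit <- auto.arima(y, xreg = xreg)

# Forecasting needs future values of every regressor, including the
# planned event dates for the forecast horizon
h <- 28
future_promo <- as.numeric((n + seq_len(h)) %% 30 == 0)
future_xreg <- cbind(fourier(y_annual, K = 3, h = h), promo = future_promo)
fc <- forecast(fit, xreg = future_xreg)
```

In practice K would usually be chosen by comparing fits (for example by AICc) across a few values rather than fixed in advance.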

Handling seasonality with Fourier terms assumes that the seasonal pattern keeps the same shape over time. Unless you have many years of data, this is rarely a problem: the pattern is unlikely to have changed much over fewer than 20 years, and the ARIMA errors will adjust for small variations in any case.

If there is a significant trend in the data, you should also allow for it in the regression part of the model. Adding some B-spline terms should handle it adequately.
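A minimal sketch of adding B-spline trend terms to the regressors, assuming bs() from the splines package. The simulated series, df = 4 and K = 2 are illustrative choices only.

```r
library(forecast)
library(splines)

set.seed(2)
n <- 3 * 365
t <- seq_len(n)

# Simulated daily series with a smooth nonlinear trend and an annual cycle
sales <- 50 + 20 * log(t) + 8 * cos(2 * pi * t / 365.25) + rnorm(n)
y <- ts(sales, frequency = 7)

# B-spline basis for the trend; df = 4 is an arbitrary illustrative choice
trend_terms <- bs(t, df = 4)
colnames(trend_terms) <- paste0("trend", seq_len(ncol(trend_terms)))

# Annual Fourier terms (K = 2 chosen arbitrarily)
fourier_terms <- fourier(ts(sales, frequency = 365.25), K = 2)

# Trend and seasonal regressors combined into one matrix
xreg <- cbind(trend_terms, fourier_terms)
fit <- auto.arima(y, xreg = xreg)
```

Note that forecasting with spline terms means evaluating the basis beyond the range of the observed data, which needs some care. A simple linear trend regressor may be a safer choice if extrapolation matters.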

OTHER TIPS

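I can recommend the forecast package on CRAN by Hyndman. Its bats and tbats models handle complex (multiple) seasonalities. Note, however, that as far as I know they do not accept exogenous regressors, so dummy variables representing marketing events would have to be handled outside the model.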

Licensed under: CC-BY-SA with attribution