문제

I've fit a GLM (Poisson) to a data set where one of the variables is categorical for the year a customer bought a product from my company, ranging from 1999 to 2012. There's a linear trend of the coefficients for the values of the variable as the year of sale increases.

Is there any problem with trying to improve predictions for 2013 and maybe 2014 by extrapolating to get the coefficients for those years?

도움이 되었습니까?

해결책

I believe that this is a case for applying time series analysis, in particular time series forecasting (http://en.wikipedia.org/wiki/Time_series). Consider the following resources on time series regression:

다른 팁

If you suspect your response is linear with year, then put year in as a numeric term in your model rather than a categorical.

Extrapolation is then perfectly valid based on the usual assumptions of the GLM family. Make sure you correctly get the errors on your extrapolated estimates.

Just extrapolating the parameters from a categorical variable is wrong for a number of reasons. The first one I can think of is that there may be more observations in some years than others, so any linear extrapolation needs to weight those year's estimates more. Just eyeballing a line - or even fitting a line to the coefficients - won't do this.

라이센스 : CC-BY-SA ~와 함께 속성
제휴하지 않습니다 datascience.stackexchange
scroll top