Machine learning out-of-test-data forecast (XGBoost, ANN)

https://datascience.stackexchange.com/questions/68616

xgboost

09-12-2020
Question

I see a lot of applications of machine learning techniques to time series. Unfortunately, almost all kernels with XGBoost or ANN stop short of creating an actual forecast. They achieve a great fit on test data that was excluded from training. Are there any kernels for XGBoost or ANN where an actual forecast is created? I do not understand how this is possible, as the predict function always needs future feature values, which are not given. When trying it out on the test set you have these values, but beyond it you don't have forecasted values for all the features, so what is the point of building such models with a lot of features that have no future values? Are all these models useless?

Please prove me wrong, but I do not understand.

Thank you.

Solution

It depends on how you built your model. But yes, most of the time you will only be able to predict $y_{t+1}$ from $X_t$. If you want to predict $y_{t+2}$ from $X_t$, you have to proceed by either:

Changing your target to $y_{t+2}$ during the calibration process. This needs some work, as you will have to rebuild your target.
Building a model that also predicts $X_{t+1}$, which allows you to apply the same model iteratively. Such an approach relies on assumptions about some sort of stationarity over time.
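
To make the two options concrete, here is a minimal sketch on a toy series. It rests on several assumptions that are not part of the answer: the features $X_t$ are only the last few lagged values of the series (so predicting $y_{t+1}$ is enough to build $X_{t+1}$), and a plain least-squares fit stands in for XGBoost or an ANN. The helper names (`make_direct_dataset`, `fit_linear`, `predict_linear`, `recursive_forecast`) are hypothetical. With exogenous features you would additionally need a forecast for each of them.

```python
import numpy as np

def make_direct_dataset(y, n_lags, horizon):
    """Row t holds the lags y[t-n_lags+1 .. t]; its target is y[t+horizon]."""
    y = np.asarray(y, dtype=float)
    rows, targets = [], []
    for t in range(n_lags - 1, len(y) - horizon):
        rows.append(y[t - n_lags + 1 : t + 1])
        targets.append(y[t + horizon])
    return np.array(rows), np.array(targets)

def fit_linear(X, target):
    """Least-squares fit with an intercept (stand-in for any regressor)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return coef

def predict_linear(coef, X):
    """Predict for one feature row or a batch of rows."""
    X = np.atleast_2d(X)
    return X @ coef[:-1] + coef[-1]

def recursive_forecast(coef, last_window, steps):
    """Apply a one-step model iteratively, feeding predictions back as lags."""
    window = list(np.asarray(last_window, dtype=float))
    forecasts = []
    for _ in range(steps):
        nxt = float(predict_linear(coef, window)[0])
        forecasts.append(nxt)
        window = window[1:] + [nxt]  # drop oldest lag, append the prediction
    return np.array(forecasts)

# Toy series (assumption): a simple linear trend 0, 1, ..., 19.
y = np.arange(20, dtype=float)

# Option 1: rebuild the target as y_{t+2} and fit a dedicated model.
X2, t2 = make_direct_dataset(y, n_lags=3, horizon=2)
coef2 = fit_linear(X2, t2)
direct_pred = float(predict_linear(coef2, y[-3:])[0])

# Option 2: fit a one-step model and apply it twice.
X1, t1 = make_direct_dataset(y, n_lags=3, horizon=1)
coef1 = fit_linear(X1, t1)
recursive_pred = recursive_forecast(coef1, y[-3:], steps=2)

print(direct_pred, recursive_pred)
```

Both routes produce a genuine out-of-sample forecast from the last observed window only. In the recursive variant, prediction errors are fed back into the features, so they can accumulate as the horizon grows.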

Those are the more general approaches. However, I feel like this does not really answer your question. The real answer to your question is that we don't really care about the level of the time series. Most applied ML techniques only care about a $y$ that you can define however you want. You can choose a more appropriate target that you care about, such as: does my time series cross a given level over a given time horizon?
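
As an illustration of redefining the target, the sketch below builds a binary label from a raw series. It assumes that "crossing" means reaching or exceeding the level at least once within the next `horizon` steps; the function name `crossing_target` and the toy values are hypothetical.

```python
import numpy as np

def crossing_target(y, level, horizon):
    """labels[t] = 1 if any of y[t+1 .. t+horizon] is >= level, else 0.

    Only time steps with a complete future window receive a label,
    so the result is `horizon` entries shorter than `y`.
    """
    y = np.asarray(y, dtype=float)
    n = len(y) - horizon
    labels = np.zeros(n, dtype=int)
    for t in range(n):
        future = y[t + 1 : t + 1 + horizon]
        labels[t] = int(np.any(future >= level))
    return labels

# Toy series (assumption) with two excursions to or above the level 5.0.
y = np.array([1.0, 2.0, 3.0, 5.0, 2.0, 1.0, 6.0, 2.0])
labels = crossing_target(y, level=5.0, horizon=2)
print(labels)
```

A classifier trained on features known at time $t$ against these labels needs no future feature values at prediction time, because the horizon is already built into the target.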

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange