Question

I'm working on a prediction project where we have a lot of cyclical features, such as hour of the day, weekday, month, day of year, etc. After some searching I decided to follow the advice here.

Now I have the sin and cos components of every cyclical feature as separate features, so month becomes month_sin and month_cos. However, I'm not sure whether the model can deal with the dependence between the two, as both components need to be equally weighted in order for the feature to make sense. After training, though, the model assigns different weights to the sin and cos components. My intuition tells me that this is bad, but I'm not sure what to do about it.
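
For reference, here is a minimal sketch of the encoding described above. It is written in plain Python rather than R purely for illustration. The helper names (`encode_cyclical`, `circle_distance`, `pearson`) and the toy data, one row per month, are assumptions and not part of the actual project.

```python
import math

def encode_cyclical(value, period):
    """Map a cyclical value onto the unit circle and return (sin, cos)."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)

def circle_distance(a, b, period):
    """Euclidean distance between two values after sin/cos encoding."""
    sin_a, cos_a = encode_cyclical(a, period)
    sin_b, cos_b = encode_cyclical(b, period)
    return math.hypot(sin_a - sin_b, cos_a - cos_b)

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy data: each month 1..12 appears exactly once.
months = list(range(1, 13))
encoded = [encode_cyclical(m, 12) for m in months]
month_sin = [s for s, _ in encoded]
month_cos = [c for _, c in encoded]

# December -> January is the same step on the circle as June -> July.
print(circle_distance(12, 1, 12))
print(circle_distance(6, 7, 12))

# Linear correlation between the two components over a full, evenly
# sampled cycle (numerically very close to zero).
print(pearson(month_sin, month_cos))
```

In this toy setting, where every month appears equally often, the linear correlation between month_sin and month_cos comes out at about zero. The two components are tied together by sin² + cos² = 1 rather than by a linear relationship, so real data with an uneven spread of months may behave differently.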

Currently gbm (R) gives the best results. For a gradient boosting model, is it better to force equal weights on the two correlated features, or is it better to let the model figure it out even if it results in different weights on the two components? Or would you suggest an entirely different approach?

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange