Question

I'm currently building a binary classifier.

My input is a sequence of 32 time-steps.

Certain time-steps of the input will be constant (e.g., t=0 will always be 0, t=5 will always be 9, etc.).

Does it make sense to add these time-steps as features into the model? I'm thinking it's not, since the model will have to pay attention to these features and they will add a kind of noise/bias to the model, as there isn't any new information to be gained from them. Am I thinking about this correctly?


Solution

You are thinking about this correctly. If a feature takes the same value for every sample, it carries no information about the outcome and doesn't need to be included.
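As a minimal sketch of that check, assuming the sequences are stacked into a NumPy array `X` of shape `(n_samples, 32)` (hypothetical data, not from the question):

```python
import numpy as np

# Hypothetical batch: 200 samples of 32 time-steps, with t=0 and t=5 constant.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 32))
X[:, 0] = 0.0
X[:, 5] = 9.0

# A time-step with zero variance across samples carries no discriminative information.
constant_steps = np.where(X.var(axis=0) == 0)[0]
print(constant_steps)      # [0 5]

X_reduced = np.delete(X, constant_steps, axis=1)
print(X_reduced.shape)     # (200, 30)
```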

That being said, if you are using time-series techniques such as trend decomposition for feature engineering, then changing the structure of your data could complicate interpretation (i.e., what does a moving average mean if you've removed data points?).
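For example, a rolling mean computed after dropping time-steps covers a different window of the original sequence, so the same statistic no longer means the same thing (a small pandas sketch with made-up values):

```python
import pandas as pd

# Hypothetical 32-step sequence; suppose t=0 and t=5 were the constant steps.
s = pd.Series(range(32), dtype=float)

# Rolling mean over the full sequence vs. the sequence with those steps dropped.
full_ma = s.rolling(window=4).mean()
reduced_ma = s.drop(index=[0, 5]).rolling(window=4).mean()

# At the same time-step the two windows now span different points,
# so "the moving average at t=8" means something different in each version.
print(full_ma.loc[8], reduced_ma.loc[8])   # 6.5 vs 6.25
```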

In that light, I'd say you should not build these quirks into the code you write; keep it as general as possible except where absolutely necessary. This is related to the concept of writing "DRY" code, where you don't repeat yourself.

Personal opinion: classifiers, like software, should try not to make assumptions about the data if possible. This gives you a better chance of being able to reuse or share them.
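One way to keep that generality, sketched here with scikit-learn (the pipeline and estimator are illustrative assumptions, not part of the original answer), is to let a fitted zero-variance filter drop constant features rather than hard-coding which time-steps to remove:

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Same hypothetical data as above: t=0 and t=5 are constant.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 32))
X[:, 0] = 0.0
X[:, 5] = 9.0
y = rng.integers(0, 2, size=200)

# VarianceThreshold(0.0) removes zero-variance features learned from the
# training data, so nothing about t=0 or t=5 is hard-coded in the model code.
clf = make_pipeline(VarianceThreshold(threshold=0.0), LogisticRegression())
clf.fit(X, y)
print(clf.named_steps["variancethreshold"].get_support().sum())  # 30 features kept
```

Whatever the filter learns on the training data is then applied consistently to any new data, so the quirk stays out of the surrounding code.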
