Question

Generally speaking, it is best to apply standardization (z-scoring the training data) prior to regularization. Does sklearn.linear_model.SGDClassifier automatically standardize the training data when the 'penalty' argument is set to a value other than none (i.e. 'l2', 'l1', or 'elasticnet')?


Solution

No. sklearn's models generally don't scale their inputs internally; that step is left to the user. This seems like the right way to do it, since you might want to try different scaling techniques depending on your data.
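In practice, the usual way to handle this yourself is to put a scaler in front of the classifier in a pipeline. Here is a minimal sketch of that, assuming synthetic data from make_classification with one feature deliberately inflated; the dataset, the random_state values, and the choice of StandardScaler are illustrative assumptions, not something from the question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (an assumption for illustration); the first feature is
# inflated to mimic a column measured in much larger units than the rest.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X[:, 0] *= 1000.0

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# SGDClassifier does not scale its input, so the scaler is an explicit step.
# fit() learns the mean/variance from the training data only, and the same
# transform is reused automatically by score() and predict().
clf = make_pipeline(
    StandardScaler(),
    SGDClassifier(penalty="l2", random_state=0),
)
clf.fit(X_train, y_train)
test_accuracy = clf.score(X_test, y_test)

# make_pipeline names each step after its lowercased class name.
scaler = clf.named_steps["standardscaler"]
```

Keeping the scaler inside the pipeline also means that cross-validation refits it on each training fold, so no statistics leak in from held-out data.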

From the User Guide:

Stochastic Gradient Descent is sensitive to feature scaling, so it is highly recommended to scale your data. For example, scale each attribute on the input vector X to [0,1] or [-1,+1], or standardize it to have mean 0 and variance 1...
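The three options mentioned in that passage map onto existing sklearn preprocessors. A small sketch, assuming a made-up 4x2 array whose columns are on very different scales:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Toy data (an assumption for illustration): two features on different scales.
X = np.array([
    [1.0, 200.0],
    [2.0, 400.0],
    [3.0, 900.0],
    [4.0, 100.0],
])

# Scale each feature to [0, 1] (MinMaxScaler's default range).
X_01 = MinMaxScaler().fit_transform(X)

# Scale each feature to [-1, +1].
X_pm1 = MinMaxScaler(feature_range=(-1, 1)).fit_transform(X)

# Standardize each feature to mean 0 and variance 1.
X_std = StandardScaler().fit_transform(X)
```

Whichever scaler you pick, fit it on the training data and reuse the fitted object to transform the test data, rather than fitting a second scaler on the test set.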

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange