Generally speaking, it is best to apply standardization (z-scoring the training data) prior to regularization. Does sklearn.linear_model.SGDClassifier automatically standardize the training data when the 'penalty' argument is set to a value other than none (i.e. 'l2', 'l1', or 'elasticnet')?

Solution

No, sklearn generally doesn't apply scaling inside any of its models; it relies on the user to do that. This seems like the right design, since you might want to try different scaling techniques depending on your data.

From the User Guide:

Stochastic Gradient Descent is sensitive to feature scaling, so it is highly recommended to scale your data. For example, scale each attribute on the input vector X to [0,1] or [-1,+1], or standardize it to have mean 0 and variance 1...
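One way to follow that advice is to chain a scaler and the classifier in a pipeline, so the scaling parameters are learned from the training data and reused at prediction time. This is only a minimal sketch: the synthetic dataset, the per-feature multipliers, and the variable names are assumptions made for illustration and do not come from the question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (illustrative only) with features on very different scales.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X = X * np.array([1.0, 10.0, 100.0, 1000.0, 0.01])

# SGDClassifier does not scale its input, so a StandardScaler is placed
# in front of it. fit() learns the mean/variance on X, then trains the
# regularized classifier on the standardized features.
model = make_pipeline(
    StandardScaler(),
    SGDClassifier(penalty="l2", random_state=0),
)
model.fit(X, y)

# The standardized features that the classifier was actually trained on.
X_scaled = model.named_steps["standardscaler"].transform(X)
predictions = model.predict(X)
```

To try the [0,1] scaling mentioned in the quote instead, StandardScaler could be swapped for sklearn.preprocessing.MinMaxScaler in the same pipeline.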

Licensed under: CC-BY-SA with attribution