Dropping one category for regularized linear models
Question
While reviewing sklearn's OneHotEncoder documentation (attached below) I noticed that when applying regularization (e.g., lasso, ridge, etc.) it is not recommended to drop the first category. While I understand why dropping the first category prevents collinearity, I am unsure why keeping it is preferred for regularized regression. Wouldn't this add an additional dimension that will need to be regularized?
drop{‘first’, ‘if_binary’}
Specifies a methodology to use to drop one of the categories per feature. This is useful in situations where perfectly collinear features cause problems, such as when feeding the resulting data into a neural network or an unregularized regression. However, dropping one category breaks the symmetry of the original representation and can therefore induce a bias in downstream models, for instance for penalized linear classification or regression models.
Solution
When you fit an unregularized linear regression on a full one-hot encoding (with an intercept), the design matrix is perfectly collinear: the dummy columns of each feature sum to the intercept column, so X'X is singular and cannot be inverted. That is why you have to leave out one column in that setting.

Regularization takes care of the singularity: the penalized normal equations involve X'X + λI, which is (almost surely) nonsingular, so there is no numerical need to drop a column. Moreover, once a penalty is applied, the choice of column to drop is no longer neutral: the penalty shrinks coefficients toward zero relative to the dropped "baseline" category, so dropping different columns would lead to different predictions, i.e. it induces a bias. Keeping all categories preserves the symmetry of the original representation.
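A minimal NumPy sketch of both points (all names and the data here are hypothetical): with a full one-hot encoding plus an intercept, X'X is singular, yet the ridge system X'X + λI solves fine; and dropping a different category before a penalized fit gives different fitted values.

```python
import numpy as np

# Hypothetical data: one categorical feature with three levels,
# one-hot encoded WITHOUT dropping any column.
rng = np.random.default_rng(0)
n = 300
cats = rng.integers(0, 3, size=n)               # category label per sample
y = np.array([1.0, 2.0, 3.0])[cats] + rng.normal(scale=0.1, size=n)

onehot = np.eye(3)[cats]                        # full one-hot (n x 3)
X_full = np.column_stack([np.ones(n), onehot])  # intercept + all dummies

# The dummy columns sum to the intercept column, so X'X is singular:
rank = np.linalg.matrix_rank(X_full.T @ X_full)  # 3, not 4

# Ridge adds lam * I to X'X, making the system nonsingular and solvable.
# (For simplicity this sketch penalizes the intercept too; real
# implementations such as sklearn's Ridge usually do not.)
lam = 1.0
beta = np.linalg.solve(X_full.T @ X_full + lam * np.eye(4), X_full.T @ y)

# Dropping a different category before a penalized fit changes the model:
# the penalty shrinks coefficients toward the dropped "baseline",
# so the fitted values depend on which column was dropped.
preds = []
for drop in range(3):
    keep = [c for c in range(3) if c != drop]
    Xd = np.column_stack([np.ones(n), onehot[:, keep]])
    bd = np.linalg.solve(Xd.T @ Xd + lam * np.eye(3), Xd.T @ y)
    preds.append(Xd @ bd)
```

With an unpenalized fit the three dropped-column models would all give identical predictions; under the penalty they diverge, which is the bias the docs warn about.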
Refer to this.