I'm just a beginner and I'm trying to implement polynomial regression in scikit-learn. The usual regression without regularization works fine:

import numpy as np
from sklearn import linear_model

regr = linear_model.LinearRegression(copy_X=True)

# First feature column is the raw time values; then append its powers 2..9
x = np.array(time_list[0:24]).reshape(24, 1)
X = x.copy()
for i in range(2, 10):
    X = np.append(X, x**i, 1)
Y = np.array(tempm_list[0:24]).reshape(24, 1)

regr.fit(X, Y)

but when I try to implement Ridge regression in exactly the same way, I get the following error:

regularized_regr = linear_model.Ridge(alpha=1, copy_X=True)
regularized_regr.fit(X, Y)


File "/usr/local/lib/python2.7/site-packages/sklearn/linear_model/ridge.py", line 449,    in fit
return super(Ridge, self).fit(X, y, sample_weight=sample_weight)
File "/usr/local/lib/python2.7/site-packages/sklearn/linear_model/ridge.py", line 338, in fit
solver=self.solver)
File "/usr/local/lib/python2.7/site-packages/sklearn/linear_model/ridge.py", line 294, in ridge_regression
coef = safe_sparse_dot(X.T, dual_coef, dense_output=True).T
UnboundLocalError: local variable 'dual_coef' referenced before assignment 

Thanks


Solution

First suggestion: decrease your polynomial degree to, e.g., <= 5. Anything higher enters the realm of overfitting given your number of samples (24).
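If your installed scikit-learn already ships sklearn.preprocessing.PolynomialFeatures, you can let it build the degree-5 feature matrix instead of appending powers by hand. A minimal sketch, reusing the time_list and tempm_list variables from the question:

import numpy as np
from sklearn import linear_model
from sklearn.preprocessing import PolynomialFeatures

# Build the columns x, x**2, ..., x**5 from the raw time values
x = np.array(time_list[0:24]).reshape(24, 1)
X5 = PolynomialFeatures(degree=5, include_bias=False).fit_transform(x)
Y = np.array(tempm_list[0:24]).reshape(24, 1)

regularized_regr = linear_model.Ridge(alpha=1, copy_X=True)
regularized_regr.fit(X5, Y)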

Second suggestion: upgrade scikit-learn to the bleeding-edge GitHub version; the traceback appears to be a bug in how an exception is raised when your matrix is singular.

If you cannot upgrade scikit-learn, try a stronger regularization:

import numpy as np
from sklearn import linear_model

# Use the largest singular value of X to set the scale of the penalty
_, S, _ = np.linalg.svd(X, full_matrices=False)
s = S[0]

alpha = 1.2 * s  # vary this factor, e.g. from 0.1 upwards

regularized_regr = linear_model.Ridge(alpha=alpha)
regularized_regr.fit(X, Y)
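
If you would rather not pick alpha by hand, another option (a sketch, assuming linear_model.RidgeCV is available in your installed version) is to let cross-validation choose it from a grid:

import numpy as np
from sklearn import linear_model

# Try a log-spaced grid of penalties and keep the one that generalizes best
alphas = np.logspace(-2, 4, 25)
cv_regr = linear_model.RidgeCV(alphas=alphas)
cv_regr.fit(X, Y)
print(cv_regr.alpha_)  # the selected regularization strength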