Found input variables with inconsistent numbers of samples
16-10-2019
Question
I would appreciate help resolving the following error. Code:
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score, train_test_split

X = np.array(pd.read_csv('my_X_table1-1c.csv', header=None).values)
y = np.array(pd.read_csv('my_y_table1-1c.csv', header=None).values.ravel())
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=7)

def Ridgecv(alpha):
    return cross_val_score(Ridge(alpha=float(alpha), random_state=2),
                           X_train, y_train, 'mae', cv=5).mean()
The error is related to X_train and y_train:
ValueError: Found input variables with inconsistent numbers of samples: [1052, 1052, 3]
regards,
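Solution
It turns out I had left out the keyword "scoring". Passed positionally, 'mae' ends up in the parameter that follows y in cross_val_score (groups), so scikit-learn treated the string as a third array of samples. The extra 3 in the error message is simply the number of characters in 'mae'. Note also that 'mae' is not a built-in scorer name in scikit-learn; the built-in equivalent is 'neg_mean_absolute_error', which returns the negated error (closer to zero is better).
def Ridgecv(alpha):
    # Pass the metric by keyword, using scikit-learn's built-in scorer name.
    return cross_val_score(Ridge(alpha=float(alpha), random_state=2),
                           X_train, y_train,
                           scoring='neg_mean_absolute_error', cv=5).mean()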
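To make the mechanism concrete, here is a minimal sketch, not the asker's actual setup. It assumes synthetic data in place of the CSV files, and the helper name ridge_cv_mae is made up for illustration. It uses sklearn.utils.check_consistent_length to show how a stray 3-character string gets counted as a third input of length 3, then runs the keyword-based call. Recent scikit-learn releases, as far as I know, make the parameters after y keyword-only, so the original positional call may fail there with a different error.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.utils import check_consistent_length

# Synthetic stand-in for the CSV data (assumption: the real files are not available).
rng = np.random.default_rng(7)
X = rng.normal(size=(200, 4))
y = X @ np.array([1.5, -2.0, 0.5, 3.0]) + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=7
)

# A string has a length too, so it is counted as one more "array" of samples.
try:
    check_consistent_length(X_train, y_train, "mae")
    message = ""
except ValueError as err:
    message = str(err)
print(message)


def ridge_cv_mae(alpha):
    """Cross-validated mean absolute error for a Ridge model (hypothetical helper)."""
    scores = cross_val_score(
        Ridge(alpha=float(alpha)),
        X_train,
        y_train,
        scoring="neg_mean_absolute_error",  # metric passed by keyword
        cv=5,
    )
    # The scorer returns negated errors, so flip the sign to report a plain MAE.
    return -scores.mean()


mae = ridge_cv_mae(1.0)
print(f"cross-validated MAE: {mae:.4f}")
```

Flipping the sign is only a reporting choice. If the function feeds an optimizer that maximizes its objective, returning the negated score directly is usually what you want.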
Other tips
The steps should happen in this order. First split the data:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, Y, random_state=101, test_size=0.3)
and then pass the matching training pair to the fit method: fit(X_train, y_train).
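The split-then-fit order can be sketched as follows. This is an illustration under assumptions: the data is synthetic, and Ridge with alpha=1.0 is just an example estimator, not something this tip prescribes. The shape check before fitting is the quickest way to catch a mismatched X/y pair before scikit-learn raises the error from the question.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic data standing in for X and Y (assumption for illustration only).
rng = np.random.default_rng(101)
X = rng.normal(size=(100, 3))
Y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

# Step 1: split features and targets together so the rows stay aligned.
X_train, X_test, y_train, y_test = train_test_split(
    X, Y, random_state=101, test_size=0.3
)

# Sanity check: both training inputs must have the same number of samples.
n_train_X, n_train_y = X_train.shape[0], y_train.shape[0]
print(n_train_X, n_train_y)

# Step 2: fit on the training pair only, then evaluate on the held-out pair.
model = Ridge(alpha=1.0)
model.fit(X_train, y_train)
test_mae = mean_absolute_error(y_test, model.predict(X_test))
print(f"test MAE: {test_mae:.4f}")
```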
Not affiliated with datascience.stackexchange