Format for X_train when using Keras with Theano

https://datascience.stackexchange.com/questions/17083

22-10-2019

Question

I want to try Keras (Theano backend) for regression, having already used sklearn.

For this I am using this nice tutorial, http://machinelearningmastery.com/regression-tutorial-keras-deep-learning-library-python/, and tried to replace the training data there with my own.

import numpy
import pandas
import pickle
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

from sklearn.model_selection import train_test_split


[X, Y] = pickle.load(open("training_data_1_week_imp_lt_15.pkl", "rb"))


X_train, X_test, y_train, y_test = train_test_split( X, Y, test_size=0.5, random_state=42)
scaler = StandardScaler()
scaler.fit(X_train)  # Don't cheat - fit only on training data
X_train = scaler.transform(X_train)

X_test = scaler.transform(X_test)

print (X_train.shape)


# define base model
def baseline_model():
    # create model
    model = Sequential()
    model.add(Dense(8, input_dim=8, init='normal', activation='relu'))
    model.add(Dense(1, init='normal'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model


# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# evaluate model with standardized dataset
estimator = KerasRegressor(build_fn=baseline_model, nb_epoch=100, batch_size=5, verbose=0)    

estimator.fit(numpy.array(X_train),y_train)

However, I get the following error:

Exception: Error when checking model target: the list of Numpy arrays that you are 
passing to your model is not the size the model expected. Expected to see 1 
arrays but instead got the following list of 6252 arrays: ...

X is in the usual sklearn format: print (X_train.shape) gives (6252, 8)

How do I format my input X correctly?

I tried transposing it, but that did not work.

I have also searched the web, but could not find a solution or explanation.

Thanks!

EDIT: Here is a small example file: https://ufile.io/8a428

[X, Y] = pickle.load(open("test.pkl", "rb"))

Solution

I solved it (still banging my head against the wall):

estimator.fit(numpy.array(X_train),numpy.array(y_train))

This works. I am not sure why. The error message is very misleading, in my humble opinion.
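A likely explanation, offered as a sketch rather than a confirmed diagnosis: the error says "model target", so it concerns y_train, not X_train. If the pickled Y is a plain Python list, train_test_split hands back a list as well, and Keras reads a list as "one array per model output". A list of 6252 numbers then looks like 6252 separate target arrays. The snippet below uses made-up values (y_list is a hypothetical stand-in for the pickled targets) to show the difference between the two containers:

```python
import numpy as np

# Hypothetical stand-in for the pickled targets: a plain Python list.
y_list = [0.5, 1.5, 2.5, 3.5]

# A list has no .shape attribute, only a length; Keras would likely
# interpret it as len(y_list) separate target arrays.
print(type(y_list), len(y_list))

# Converting it yields one array with one entry per sample, which is
# what a model with a single output expects.
y_array = np.asarray(y_list, dtype="float32")
print(type(y_array), y_array.shape)
```

Checking type(y_train) right after the split should confirm whether this is what happened with the real data.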

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange