Question

I want to build a classifier in Keras that predicts the next item bought by a customer (i.e. multiclass classification). One of the features I intend to input to the model will be the last item bought by a particular customer. My problem is that the list of possible items is extremely large, several tens of thousands. With that in mind, I'm going to feed the ItemID feature into an Embedding Layer in Keras, and concatenate that with other features before running it through the model.

My question is: can I use Keras's shared-layer functionality to also embed the labels for the training data (since they come from the same vocabulary as the embedded ItemID input), so that instead of a softmax output of n_classes I have an output of n_dimensions, where n_dimensions is however many dimensions I set that Embedding layer to?


Solution

I got it. You define a second model consisting of an input, the shared embedding layer, and a flattened output. Then pass the output of that model's .predict() as the y parameter of your main model's .fit() call, like so:


from tensorflow.keras.layers import Input, Embedding, Flatten, Concatenate, Dense
from tensorflow.keras.models import Model

NUMERIC_FEATURES = [
    # Define the subset of features to pass to the numeric input layer
]

vocab_size = 10000  # number of items
n_dimensions = 32   # dimensions to embed down to

# Shared layer for ItemID embeddings.
itemEmbedding = Embedding(vocab_size, n_dimensions, name='Item-Embedding')

# "Main" model definition, which takes the ItemID feature plus a bunch of
# numeric features as input
ii = Input(shape=(1,), name='Item-Input')
ie = itemEmbedding(ii)
itf = Flatten()(ie)  # note: 'if' is a reserved word in Python, so it can't be a variable name

ni = Input(shape=(7,), name='Numeric-Inputs')

c = Concatenate()([ni, itf])

d1 = Dense(512, activation='relu')(c)
d2 = Dense(256, activation='relu')(d1)
d3 = Dense(128, activation='relu')(d2)
o = Dense(n_dimensions, activation='relu')(d3)

model = Model(inputs=[ni, ii], outputs=o)
# Don't take my word for the loss here; this is a toy example of code that
# runs but is not intended to be completely "correct"
model.compile(optimizer='adam', loss='mse')

# "Labels" model definition, which embeds the labels using the shared
# embedding layer
li = Input(shape=(1,), name='Label-Input')
le = itemEmbedding(li)
lf = Flatten()(le)

labelModel = Model(inputs=li, outputs=lf)

# Train the main model, using the output of labelModel.predict() as the y
# parameter; df is the training DataFrame with the numeric feature columns
# and an ItemID column
model.fit([df[NUMERIC_FEATURES], df.ItemID], labelModel.predict(df.ItemID))

That's how it's done; whether it's useful, I have not yet evaluated.
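One thing to keep in mind: at inference time this model outputs an n_dimensions vector rather than class probabilities, so you still need to map that vector back to an item. A minimal sketch of one way to do that, assuming items are indexed 0..vocab_size-1 and using cosine similarity against the shared embedding's weights (nearest_items is a hypothetical helper, not part of the answer above):

import numpy as np

def nearest_items(pred_vectors):
    # Weights of the shared 'Item-Embedding' layer: (vocab_size, n_dimensions)
    weights = itemEmbedding.get_weights()[0]
    weights_norm = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    preds_norm = pred_vectors / np.linalg.norm(pred_vectors, axis=1, keepdims=True)
    # Cosine similarity of each prediction against every item embedding
    sims = preds_norm @ weights_norm.T  # (n_samples, vocab_size)
    return sims.argmax(axis=1)          # predicted ItemID per sample

predicted_ids = nearest_items(model.predict([df[NUMERIC_FEATURES], df.ItemID]))

Whether cosine similarity or plain Euclidean distance is the right lookup here depends on the loss you settle on, so treat this as a starting point rather than the canonical decoding step.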

Licensed under: CC-BY-SA with attribution