TensorFlow: how to look up and average a different number of embedding vectors per training instance, with multiple training instances per minibatch?

https://datascience.stackexchange.com/questions/38887

31-10-2019

Question

In a recommender system setting, let's say I want to learn to predict future item purchases from a user's past purchases, using an approach inspired by YouTube's recommender system:

Concretely, let's say I have a trainable content-based network that receives an item as input and, based on its content, returns an embedding for that item. Each user has purchased a variable number of items in the past (some users might have purchased 5 items, others 1, others 10, some outliers 100, etc.). I want to generate a user vector, a candidate item vector, and then a user-item match score as follows:

1. I map each item purchased by that user to its embedded item vector using the trainable content-based network.
2. I calculate the average of all those embedded item vectors (as illustrated in the picture).
3. I apply a couple of ReLU layers on top of this average, thus obtaining a user vector.
4. I map a candidate item (to be recommended) to its embedded item vector using the same trainable content-based network as in step 1 (the weights of this network are always shared, like a Siamese network, so to speak).
5. Finally, I compute the dot product between the user vector and the candidate item vector, apply a cross-entropy loss during training, etc.
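
For concreteness, the five steps above could be sketched for a single user roughly as follows. This is only an illustration under assumptions that are not part of the question: TensorFlow 2.x in eager mode, a single `Dense` layer named `item_net` standing in for the content-based network, and made-up sizes `FEATURE_DIM` and `EMBED_DIM`.

```python
import tensorflow as tf

FEATURE_DIM, EMBED_DIM = 8, 4  # hypothetical sizes, chosen only for the example

# Hypothetical stand-in for the trainable content-based item network.
item_net = tf.keras.layers.Dense(EMBED_DIM)

# "A couple of ReLU layers" applied on top of the averaged item vectors.
user_tower = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(EMBED_DIM, activation="relu"),
])

def match_logit(purchased_features, candidate_features):
    """purchased_features: [n_items, FEATURE_DIM]; candidate_features: [FEATURE_DIM]."""
    item_vecs = item_net(purchased_features)                     # step 1: [n_items, EMBED_DIM]
    avg_vec = tf.reduce_mean(item_vecs, axis=0, keepdims=True)   # step 2: [1, EMBED_DIM]
    user_vec = user_tower(avg_vec)                               # step 3: [1, EMBED_DIM]
    cand_vec = item_net(candidate_features[tf.newaxis, :])       # step 4: same weights as step 1
    return tf.reduce_sum(user_vec * cand_vec, axis=-1)           # step 5: dot product, shape [1]

purchased = tf.random.normal([5, FEATURE_DIM])  # a user with 5 past purchases
candidate = tf.random.normal([FEATURE_DIM])     # one candidate item, treated as a positive
logit = match_logit(purchased, candidate)
loss = tf.nn.sigmoid_cross_entropy_with_logits(
    labels=tf.ones_like(logit), logits=logit)
```

Because `item_net` is the same layer object in steps 1 and 4, its weights are shared between the user side and the candidate side.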

So my question is about the technical details of how to implement the embedding lookup and the average of a variable number of embedded item vectors per user in TensorFlow, given that during training each mini-batch may contain many training instances, each possibly corresponding to a different user with a different number of past purchases. Although the context is different, my question is very similar to this one, but unfortunately nobody has answered that question so far.
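
One common way to handle variable-length lists in a mini-batch is to pad every user's item list to the same length and mask out the padding when averaging. The sketch below assumes TensorFlow 2.x and, for brevity, a plain embedding table `item_table` indexed by item id instead of the content-based network; `NUM_ITEMS`, `EMBED_DIM` and `masked_average` are hypothetical names introduced only for illustration.

```python
import tensorflow as tf

NUM_ITEMS, EMBED_DIM = 10, 3  # hypothetical sizes

# Hypothetical table of item embeddings, one row per item id.
# (In the question these vectors would come from the content-based network.)
item_table = tf.Variable(tf.random.normal([NUM_ITEMS, EMBED_DIM]))

def masked_average(padded_ids, lengths):
    """padded_ids: [batch, max_len] item ids; lengths: [batch] true item counts."""
    vecs = tf.nn.embedding_lookup(item_table, padded_ids)           # [batch, max_len, D]
    mask = tf.sequence_mask(
        lengths, maxlen=tf.shape(padded_ids)[1], dtype=vecs.dtype)  # [batch, max_len]
    summed = tf.reduce_sum(vecs * mask[:, :, tf.newaxis], axis=1)   # padding contributes 0
    counts = tf.maximum(tf.cast(lengths, vecs.dtype), 1.0)          # guard against empty users
    return summed / counts[:, tf.newaxis]                           # [batch, D]

# Two users in one mini-batch: 3 purchases and 1 purchase (padded with id 0).
padded_ids = tf.constant([[1, 4, 7], [2, 0, 0]])
lengths = tf.constant([3, 1])
user_avgs = masked_average(padded_ids, lengths)  # shape [2, EMBED_DIM]
```

The padded id value does not matter, since the mask zeroes out those positions before the sum. Other options that may be worth checking are `tf.RaggedTensor` inputs or `tf.nn.embedding_lookup_sparse` with `combiner="mean"`, though whether they fit depends on how the item vectors are produced.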

No correct solution

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange