I am trying to write a model that takes a vector from one embedding (say $E_1$) as input and predicts the corresponding vector in a second embedding, $E_2$. Both embeddings consist of dense real-valued vectors in $\mathbb{R}^n$.

Concretely, one is a skipgram word embedding and the other is a node2vec graph embedding. I have approximately 30,000 training examples that provide a mapping between the two. Since both are just real vectors, it seems like a trivial task to write a simple MLP that learns the non-linear transformation from one to the other (I don't really care about overfitting here, since the domain is closed).
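
For concreteness, the training data is just two aligned matrices (the file names here are hypothetical stand-ins for the real embedding dumps):

import numpy as np

X = np.load("skipgram_vectors.npy")   # shape (30000, e1_dim): inputs from E_1
Y = np.load("node2vec_vectors.npy")   # shape (30000, e2_dim): targets from E_2
assert len(X) == len(Y)               # one aligned pair per training example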

However, I can't seem to get it to work properly.

In Keras, something like the following should naively work:

from keras.layers import Input, Dense
from keras.models import Model
inp = Input(shape=(e1_dim,))  # `in` is a reserved word in Python, so renamed
hidden = Dense(some_value, activation="tanh")(inp)
out = Dense(e2_dim)(hidden)   # linear output, since this is a regression
model = Model(inputs=inp, outputs=out)
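
which I then compile and fit along these lines (the optimizer, epoch count, and batch size are placeholder choices):

model.compile(optimizer="adam", loss="mae")
model.fit(X, Y, epochs=100, batch_size=128, validation_split=0.1)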

I have tried adding more hidden layers, but I think my problem lies in the fact that the components of the input and output vectors lie in the range (-1, 1), so the choice of initializer, loss function, and activation function is critical.
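
For example, one option along those lines would be to bound the output layer to the same range as the targets:

# Squash predictions into (-1, 1) to match the range of the target vectors
out = Dense(e2_dim, activation="tanh")(hidden)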

I have tried setting the initializer to RandomUniform, but still got no good results. For the loss I have tried MAE and cosine_proximity, but both seem to produce terrible results. In particular, cosine_proximity never seems to get above -0.5, which might be a sign. Any thoughts on the choice of architecture and loss function for mapping one embedding onto another (essentially a high-dimensional non-linear regression)?
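
For completeness, the setup I have been experimenting with looks roughly like this (building on the snippet above; the initializer bounds and hyperparameters are placeholders):

from keras.initializers import RandomUniform

# Small uniform initialization, since inputs and targets live in (-1, 1)
init = RandomUniform(minval=-0.05, maxval=0.05)
hidden = Dense(some_value, activation="tanh", kernel_initializer=init)(inp)
out = Dense(e2_dim, kernel_initializer=init)(hidden)
model = Model(inputs=inp, outputs=out)

# cosine_proximity is the negative cosine similarity, so -1.0 would be a perfect fit
model.compile(optimizer="adam", loss="cosine_proximity", metrics=["mae"])
model.fit(X, Y, epochs=100, batch_size=128, validation_split=0.1)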
