Question

I have a list of stock price sequences with 20 timesteps each. That's a 2D array of shape (total_seq, 20). I can reshape it into (total_seq, 20, 1) for concatenation with other features.
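
For reference, this is roughly how I prepare the price input (random placeholder data and sizes, just to show the shapes):

```python
import numpy as np

total_seq, timesteps = 1000, 20                    # placeholder sizes
prices = np.random.rand(total_seq, timesteps)      # shape (total_seq, 20)

# add a trailing feature axis so the prices can be concatenated with other features
prices = prices.reshape(total_seq, timesteps, 1)   # shape (total_seq, 20, 1)
```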

I also have a news title of 10 words for each timestep, so I have a 3D array of shape (total_seq, 20, 10) containing the news tokens produced by Tokenizer.texts_to_sequences() and sequence.pad_sequences().
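
Roughly like this (placeholder headlines, and I am assuming the tf.keras preprocessing utilities here):

```python
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# placeholder data: total_seq sequences, each holding 20 headline strings
headlines = [["market rallies on strong earnings report"] * 20] * 3   # total_seq = 3

tokenizer = Tokenizer()
tokenizer.fit_on_texts([h for seq in headlines for h in seq])

max_words = 10
news_tokens = np.stack([
    pad_sequences(tokenizer.texts_to_sequences(seq), maxlen=max_words)
    for seq in headlines
])
print(news_tokens.shape)   # (3, 20, 10), i.e. (total_seq, 20, 10)
```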

I want to concatenate the news embedding with the stock prices and make predictions.

My idea is that the news embedding should return a tensor of shape (total_seq, 20, embed_size) so that I can concatenate it with the stock prices of shape (total_seq, 20, 1) and then feed the result into LSTM layers.
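
In other words, if the news were already reduced to one vector per timestep, the rest of the model would be straightforward, something like this (embed_size = 32 and the layer sizes are arbitrary placeholders):

```python
from tensorflow.keras import layers, Model, Input

embed_size = 32                              # arbitrary embedding width

# pretend the news is already one vector per timestep
news_vec = Input(shape=(20, embed_size))     # (batch, 20, embed_size)
price_in = Input(shape=(20, 1))              # (batch, 20, 1)

merged = layers.Concatenate(axis=-1)([news_vec, price_in])   # (batch, 20, embed_size + 1)
hidden = layers.LSTM(64)(merged)
output = layers.Dense(1)(hidden)

model = Model(inputs=[news_vec, price_in], outputs=output)
```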

To do that, I would need to convert the news tokens of shape (total_seq, 20, 10) into a tensor of shape (total_seq, 20, 10, embed_size) using the Embedding() layer.

But in Keras, the Embedding() layer takes a 2D tensor, not a 3D tensor. How do I get around this problem?
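
To be clear about what I mean: as far as I understand, the documented usage of Embedding() is 2D in, 3D out, e.g. for a single 10-word headline (placeholder sizes):

```python
from tensorflow.keras import layers, Input

vocab_size, embed_size = 10000, 32                                 # placeholder sizes

headline_in = Input(shape=(10,), dtype="int32")                    # (batch, 10)
embedded = layers.Embedding(vocab_size, embed_size)(headline_in)   # (batch, 10, embed_size)
```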

Assuming Embedding() did accept a 3D tensor and gave me a 4D tensor as output, I would then collapse the word dimension by using an LSTM that returns only the last word's output, so the tensor of shape (total_seq, 20, 10, embed_size) would be reduced to (total_seq, 20, embed_size).

But then I would run into another problem: LSTM accepts a 3D tensor, not a 4D one.
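
For a single headline the reduction I have in mind is simple, since an LSTM with the default return_sequences=False returns only the last step's output (placeholder sizes again):

```python
from tensorflow.keras import layers, Input

embed_size = 32                                        # placeholder

one_headline = Input(shape=(10, embed_size))           # (batch, 10, embed_size)
headline_vec = layers.LSTM(embed_size)(one_headline)   # (batch, embed_size)
```

But I don't see how to apply this across the 20-timestep axis of a 4D tensor.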

How do I get around Embedding and LSTM not accepting my inputs?

