Question

I just started learning NLP technologies such as GPT, BERT, XLNet, word2vec, GloVe, etc. I try my best to read the papers and check the source code, but I still cannot understand them very well.

When we use word2vec or GloVe to convert a word into a vector, it looks like:

[0.1,0.1,0.2...]

So, one document looks like:

[0.1,0.1,0.2...]
[0.1,0.05,0.1...]
[0.1,0.1,0.3...]
[0.1,0.15,0.1...]
.......

So, one document is a matrix. If I want to use a traditional method like a random forest to classify documents, how do I use such data? I was told that BERT or other NLP models can do this, but I am really curious about how word embeddings are applied in traditional methods.
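For concreteness, here is a minimal sketch of one common approach, not an authoritative answer: average each document's word vectors into a single fixed-length vector, then train a scikit-learn RandomForestClassifier on those vectors. The tiny `embeddings` dictionary and the example documents below are made-up placeholders standing in for real pre-trained word2vec/GloVe vectors.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy embedding lookup: word -> 3-dimensional vector (placeholder values,
# standing in for real pre-trained word2vec/GloVe vectors).
embeddings = {
    "good":  np.array([0.8, 0.1, 0.2]),
    "great": np.array([0.7, 0.2, 0.1]),
    "bad":   np.array([-0.6, 0.3, 0.2]),
    "awful": np.array([-0.8, 0.4, 0.1]),
    "movie": np.array([0.0, 0.5, 0.5]),
}

def document_vector(tokens, dim=3):
    """Average the word vectors of a document into one fixed-length vector."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return np.zeros(dim)
    return np.mean(vectors, axis=0)

# Each document (a matrix of word vectors) is collapsed to a single row,
# so the traditional classifier sees one fixed-size feature vector per document.
docs = [["good", "movie"], ["great", "movie"], ["bad", "movie"], ["awful", "movie"]]
labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative (toy labels)

X = np.vstack([document_vector(d) for d in docs])
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, labels)

print(clf.predict([document_vector(["good", "great"])]))  # expected output: [1]
```

Averaging throws away word order, which is one reason sequence models such as BERT usually work better, but it turns a variable-length document into the fixed-size input a random forest needs.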

No correct solution
