Question

I just started learning NLP technologies such as GPT, BERT, XLNet, word2vec, GloVe, etc. I try my best to read the papers and check the source code, but I still cannot understand them very well.

When we use word2vec or GloVe to convert a word into a vector, it looks like:

[0.1,0.1,0.2...]

So, one document looks like:

[0.1,0.1,0.2...]
[0.1,0.05,0.1...]
[0.1,0.1,0.3...]
[0.1,0.15,0.1...]
.......

So, one document is a matrix. If I want to use a traditional method like a random forest to classify documents, how do I use such data? I was told that BERT or other NLP models can do this, but I am really curious about how word embeddings are applied in traditional methods.
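For concreteness, here is a minimal sketch of one common approach, not an authoritative answer: average each document's word vectors into a single fixed-length vector, then train a scikit-learn RandomForestClassifier on those vectors. The tiny `embeddings` dictionary and the example documents below are made-up placeholders standing in for real pre-trained word2vec/GloVe vectors.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy embedding lookup: word -> 3-dimensional vector (placeholder values,
# standing in for real pre-trained word2vec/GloVe vectors).
embeddings = {
    "good":  np.array([0.8, 0.1, 0.2]),
    "great": np.array([0.7, 0.2, 0.1]),
    "bad":   np.array([-0.6, 0.3, 0.2]),
    "awful": np.array([-0.8, 0.4, 0.1]),
    "movie": np.array([0.0, 0.5, 0.5]),
}

def document_vector(tokens, dim=3):
    """Average the word vectors of a document into one fixed-length vector."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return np.zeros(dim)
    return np.mean(vectors, axis=0)

# Each document (a matrix of word vectors) is collapsed to a single row,
# so the traditional classifier sees one fixed-size feature vector per document.
docs = [["good", "movie"], ["great", "movie"], ["bad", "movie"], ["awful", "movie"]]
labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative (toy labels)

X = np.vstack([document_vector(d) for d in docs])
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X, labels)

print(clf.predict([document_vector(["good", "great"])]))  # expected output: [1]
```

Averaging throws away word order, which is one reason sequence models such as BERT usually work better, but it turns a variable-length document into the fixed-size input a random forest needs.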

No correct solution
