Document embedding vs locality sensitive hashing for document clustering
Question
I would like to compare two methods for measuring the similarity between two documents: locality-sensitive hashing (LSH) and document embedding. Both methods encode the information in a document as a vector, which I would like to use to find similar documents in a very large corpus (potentially more than 100,000 documents). Has anybody compared these two methods, and what are the advantages of each?
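To make the comparison concrete, here is a minimal sketch of the two kinds of vectors I have in mind, using only the Python standard library. It is illustrative rather than a reference implementation: MinHash is just one common LSH family (it estimates Jaccard similarity over word shingles), and the plain term-count vector is a stand-in for a learned document embedding, which would normally come from a trained model. The function names, the shingle size `k=3`, and `num_hashes=64` are my own assumptions.

```python
import hashlib
import math
from collections import Counter


def shingles(text, k=3):
    """Return the set of k-word shingles of a lowercased, whitespace-split text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def minhash_signature(items, num_hashes=64):
    """MinHash signature: for each salted hash function, keep the minimum hash value."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # a different salt gives a different hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(s.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for s in items
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def term_vector(text):
    """Sparse term-count vector; an assumed stand-in for a learned embedding."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox leaps over the lazy dog"

print("MinHash estimate:", estimated_jaccard(
    minhash_signature(shingles(doc_a)), minhash_signature(shingles(doc_b))))
print("Cosine similarity:", cosine(term_vector(doc_a), term_vector(doc_b)))
```

In this sketch the MinHash signature reflects exact overlap of word shingles, while the vector-based score compares documents in a continuous space. With a real embedding model that second score could also capture paraphrases that share few exact words, which is the trade-off I am trying to understand.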
Cheers in advance
No accepted solution
Not affiliated with datascience.stackexchange