Question

I am implementing LDA topic modeling in R.

One essential parameter is the number of topics.

Which of the following approaches would be the most suitable for choosing it?

1. mallet
2. stm
3. ldatuning, as described at https://cran.r-project.org/web/packages/ldatuning/vignettes/topics.html

Solution

There's no single "most suitable" way, but one of them might work better with your data than the others. The only way to find out is to try each of them and compare the results.
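As a starting point for that kind of comparison, here is a minimal sketch of the third option from the question, using the ldatuning package. It assumes the CRAN packages topicmodels and ldatuning are installed; the AssociatedPress example corpus, the 10-document slice, the candidate range of 2 to 15 topics, and the seed are illustrative choices, not recommendations for your data.

```r
# Assumes the CRAN packages topicmodels and ldatuning are installed.
library(topicmodels)
library(ldatuning)

# Example corpus shipped with topicmodels; a small slice keeps the run short.
# Replace dtm with your own DocumentTermMatrix.
data("AssociatedPress", package = "topicmodels")
dtm <- AssociatedPress[1:10, ]

# Fit one LDA model per candidate number of topics and score each model
# with several metrics. The range 2..15 is an arbitrary illustration.
result <- FindTopicsNumber(
  dtm,
  topics  = seq(from = 2, to = 15, by = 1),
  metrics = c("Griffiths2004", "CaoJuan2009", "Arun2010", "Deveaud2014"),
  method  = "Gibbs",
  control = list(seed = 77),
  verbose = TRUE
)

# One row per candidate k, one column per metric.
print(result)

# Plot the metrics: look for low CaoJuan2009 and Arun2010 values
# and high Griffiths2004 and Deveaud2014 values.
FindTopicsNumber_plot(result)
```

The metrics will not always agree on a single value, so it is probably best to treat the output as a shortlist of candidates and then inspect the resulting topics by hand.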

If choosing the number of topics is an issue, you might be interested in the non-parametric extension of LDA, which doesn't require you to specify the number of topics in advance and instead infers it from the data. This is called the Hierarchical Dirichlet Process (HDP).

Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange