Different ways to detect the appropriate number of topics
12-12-2020
Question
I am implementing LDA topic modeling in R.
One essential parameter is the number of topics.
Which of the following approaches is the most suitable for selecting it?
1. mallet
2. stm
3. or this way https://cran.r-project.org/web/packages/ldatuning/vignettes/topics.html
Solution
There's no "most suitable" way, but there might be one which works better with your data. The only way to know that would be to try all of them.
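As a rough illustration of trying several candidate topic counts, here is a minimal sketch based on the ldatuning package linked in the question. It assumes the CRAN packages topicmodels and ldatuning are installed; the AssociatedPress sample data, the 20-document subset, the candidate range 2 to 10, and the seed are arbitrary choices for the example, not recommendations.

```r
# Assumption: the CRAN packages 'topicmodels' and 'ldatuning' are installed.
library(topicmodels)
library(ldatuning)

# Example document-term matrix shipped with topicmodels.
# Replace it with your own DocumentTermMatrix; the small subset only keeps the run short.
data("AssociatedPress", package = "topicmodels")
dtm <- AssociatedPress[1:20, ]

# Fit one LDA model per candidate number of topics and score each model
# with the four metrics that ldatuning provides.
result <- FindTopicsNumber(
  dtm,
  topics   = seq(from = 2, to = 10, by = 2),  # candidate values (arbitrary here)
  metrics  = c("Griffiths2004", "CaoJuan2009", "Arun2010", "Deveaud2014"),
  method   = "Gibbs",
  control  = list(seed = 77),                 # fixed seed for reproducibility
  mc.cores = 1L,
  verbose  = FALSE
)

# One row per candidate number of topics, one column per metric.
print(result)

# Plot the scores: CaoJuan2009 and Arun2010 are to be minimized,
# Griffiths2004 and Deveaud2014 are to be maximized.
FindTopicsNumber_plot(result)
```

The metrics do not always agree, so it is usually worth inspecting the resulting topics for a few of the best-scoring candidates rather than relying on a single number.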
If choosing the number of topics is a problem, you might be interested in the non-parametric extension of LDA, which infers the number of topics from the data instead of requiring you to specify it. This is called the Hierarchical Dirichlet Process (HDP); see for instance this introduction.
Licensed under: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange