Question

If I'm not mistaken, in this paper (http://svn.aksw.org/papers/2015/WSDM_Topic_Evaluation/public.pdf) it appears that models with a larger number of topics will inherently have larger coherence measures, because the measure sums over all the topics. Is there any flaw in dividing the coherence value by the number of topics to get a per-topic average, doing this for a model with 5 topics and for a model with 10 topics, and then comparing the two averages?

In gensim specifically, I'm using the c_v coherence measure.
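To make the proposed normalization concrete, here is a minimal sketch of the arithmetic in plain Python. The per-topic scores are made-up numbers for illustration, not output from any real model, and the helper names `summed_coherence` and `average_coherence` are hypothetical. In gensim, per-topic values could presumably be obtained from `CoherenceModel.get_coherence_per_topic()`; it is worth checking whether `get_coherence()` already aggregates them with a mean, in which case no extra division would be needed.

```python
from statistics import mean


def summed_coherence(per_topic_scores):
    """Total coherence: grows with the number of topics."""
    return sum(per_topic_scores)


def average_coherence(per_topic_scores):
    """Per-topic average: the sum divided by the number of topics."""
    if not per_topic_scores:
        raise ValueError("need at least one per-topic score")
    return mean(per_topic_scores)


# Hypothetical per-topic coherence values (illustrative only).
scores_5_topics = [0.50, 0.40, 0.60, 0.50, 0.50]
# A 10-topic model whose topics are, by construction, equally coherent.
scores_10_topics = scores_5_topics * 2

for name, scores in [("5 topics", scores_5_topics),
                     ("10 topics", scores_10_topics)]:
    print(name,
          "sum:", round(summed_coherence(scores), 3),
          "mean:", round(average_coherence(scores), 3))
```

In this toy case the sum doubles when the topic count doubles while the mean stays the same, which is the effect the division is meant to remove. Whether the averaged values are otherwise comparable across models is a separate question that the sketch does not address.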

No correct solution

Licensed under: CC-BY-SA with attribution