Question

I have a question about the inference results of the lda-c-dist package. How many words should be displayed when viewing the results of inference? For example, if I set the number of words to a very large number N (assume N is the total number of terms), the output seems to consist of groups of words, and within each group the word indices range from 1 to N.

Here is what I get, assuming the number of terms is 10 and I set the number of words displayed to 10:

Topic 0xx:
001
008
009
002
003
007
000
004
005
006

It seems that maybe I should set the number of words displayed to 3, not 10.

So, for a single topic, how many words should be specified when viewing topics by calling topics.py?

Besides, I'm going to use this output to calculate the similarity of two topics. So ...


Solution

Actually, a topic can list as many items as there are terms in the vocabulary. What is displayed here is just the terms sorted in descending order of probability, cut off at the number you specify.
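
To make this concrete, here is a minimal sketch of that ranking. It is not the code from topics.py: the helper names `top_words` and `cosine_similarity` are hypothetical, and the per-term log probabilities are made up so that the ordering matches the listing in the question. It assumes each topic is available as a list with one log probability per vocabulary term.

```python
import math


def top_words(log_probs, n):
    """Return the indices of the n most probable terms, most probable first."""
    order = sorted(range(len(log_probs)), key=lambda i: log_probs[i], reverse=True)
    return order[:n]


def cosine_similarity(p, q):
    """Cosine similarity between two topics given as full probability vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm


# Made-up probabilities for a 10-term vocabulary (illustration only).
probs_a = [0.05, 0.30, 0.12, 0.10, 0.04, 0.03, 0.02, 0.06, 0.15, 0.13]
probs_b = [0.20, 0.05, 0.05, 0.05, 0.25, 0.10, 0.10, 0.05, 0.10, 0.05]

# Assumption: topics are stored as log probabilities, one value per term.
topic_a = [math.log(p) for p in probs_a]
topic_b = [math.log(p) for p in probs_b]

# Asking for all 10 terms just returns every term, ranked by probability.
print(top_words(topic_a, 10))
# Asking for 3 returns the first 3 entries of that same ranking.
print(top_words(topic_a, 3))

# For topic similarity, compare the full distributions, not the truncated lists.
sim = cosine_similarity([math.exp(x) for x in topic_a],
                        [math.exp(x) for x in topic_b])
print(sim)
```

So the number of words you pass only controls how much of the ranking is shown. For comparing two topics, it is probably safer to work with the full per-term probabilities than with the displayed top-N lists; cosine similarity is used here only as one possible measure.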

Licensed under: CC-BY-SA with attribution