I think something like this may be what you want:
for (ii in seq_len(nrow(result$topics))) {
  # Cumulative share of topic ii's assignments held by its 20 most frequent words
  print(
    head(
      cumsum(
        sort(result$topics[ii, ], decreasing = TRUE)
      ),
      n = 20
    ) / result$topic_sums[ii]
  )
}
Let's break it down. If you want the fraction of Gibbs assignments, that is easy: the LDA routine returns the number of assignments to each (word, topic) pair. So all you have to do is sort each row of result$topics to get the top words (this is essentially what top.topic.words does if you set by.score=FALSE). Once a row is in sorted order, you can see, for each topic, how many assignments go to each word versus to the entire topic. To do that I divide by result$topic_sums, which contains the total number of assignments for each topic. Finally, I use cumsum so you can see the running total weight of the top words in that topic.
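If you want to check the arithmetic without fitting a model, here is a minimal sketch on a made-up count matrix. The toy topics matrix, the word names, and the helper name cumulative_share are all assumptions for illustration; they are not part of the lda package. The helper just wraps the same sort, cumsum, head, and divide steps described above.

```r
# Toy stand-in for result$topics: rows are topics, columns are words.
# The counts are made up for illustration only.
topics <- matrix(
  c(5, 3, 2, 0,
    1, 1, 4, 4),
  nrow = 2, byrow = TRUE,
  dimnames = list(NULL, c("apple", "banana", "cherry", "date"))
)

# Stand-in for result$topic_sums: total assignments per topic.
topic_sums <- rowSums(topics)

# Hypothetical helper: for each topic, the running share of assignments
# covered by its n most frequent words.
cumulative_share <- function(topics, topic_sums, n = 20) {
  lapply(seq_len(nrow(topics)), function(ii) {
    ranked <- sort(topics[ii, ], decreasing = TRUE)  # most frequent words first
    head(cumsum(ranked), n = n) / topic_sums[ii]     # running fraction of the topic
  })
}

shares <- cumulative_share(topics, topic_sums, n = 3)
print(shares[[1]])
```

For the first toy topic, the top three words cover 0.5, 0.8, and 1.0 of the assignments, so the last value tells you how much of the topic the words you kept account for.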