In the LDA model, how are the multinomial parameters (theta) drawn from the Dirichlet prior weight (alpha)?

https://stackoverflow.com/questions/18180639

24-06-2022

Question

I'm a freshman studying the LDA (Latent Dirichlet Allocation) model, and I've run into a problem.

How is theta drawn from alpha?

theta ~ Dir(alpha)

As I understand it, theta is a vector of length K whose components are the topic proportions in a document, and each document has its own theta. At the corpus level, alpha is still a K-vector, whereas theta is an M (# of docs) by K (# of topics) matrix.

First question: Is what I described above correct?

Second question: If so, how can the different thetas (K-vectors), one per document, all be drawn from the same Dirichlet distribution?

Solution

First answer: Yes, you are exactly right.

Second answer: Alpha is a K-vector, as you mentioned. Each sample from the Dirichlet distribution Dir(alpha) is another K-vector whose components are non-negative and sum to 1, which is why they can be read as the proportions of the topics in one document. The particular values depend on alpha but are random, so independent draws from the same distribution generally differ from one another. We sample once per document to obtain M vectors; stacked as rows, they form the MxK matrix theta.
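
To make the per-document sampling concrete, here is a minimal sketch using NumPy's `Generator.dirichlet`. The sizes `M` and `K`, the values in `alpha`, and the seed are assumptions chosen for illustration; they do not come from the question.

```python
import numpy as np

M, K = 5, 3                           # assumed sizes: 5 documents, 3 topics
alpha = np.array([2.0, 1.0, 0.5])     # assumed prior; one K-vector shared by all documents

rng = np.random.default_rng(0)        # seeded only so that reruns give the same draws
theta = rng.dirichlet(alpha, size=M)  # M independent draws from Dir(alpha), one row per document

print(theta.shape)                    # (5, 3)
print(theta.sum(axis=1))              # each row sums to 1, up to floating-point error
print(theta)                          # rows differ even though alpha is the same for all
```

Every row comes from the same distribution, yet the rows are different vectors, in the same way that repeated rolls of one die give different numbers.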

The length of the vector we get from sampling the Dirichlet distribution is always equal to the length of its parameter, alpha.
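
As a small check of the length relationship, the sketch below draws one vector for two different alphas. The helper `draw_theta` and the alpha values are hypothetical, introduced only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)  # seeded only for reproducibility

def draw_theta(alpha):
    """Hypothetical helper: draw one topic-proportion vector from Dir(alpha)."""
    return rng.dirichlet(np.asarray(alpha, dtype=float))

theta_3 = draw_theta([1.0, 1.0, 1.0])            # alpha has K = 3 entries
theta_5 = draw_theta([2.0, 2.0, 2.0, 2.0, 2.0])  # alpha has K = 5 entries

print(len(theta_3), len(theta_5))  # 3 5
```

In both cases the sampled vector has exactly as many components as alpha, and those components sum to 1.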

Licensed under: CC-BY-SA with attribution
