I think this tutorial will help you understand everything very clearly - https://www.youtube.com/watch?v=DDq3OVp9dNA
I also had a lot of trouble understanding it at first. I'll try to outline the key points in a nutshell.
In Latent Dirichlet Allocation,
- The order of words is not important in a document - Bag of Words model.
- A document is a distribution over topics
- Each topic, in turn, is a distribution over words belonging to the vocabulary
- LDA is a probabilistic generative model: the hidden topic structure is inferred from the observed words via the posterior distribution.
Imagine the process of creating a document to be something like this -
- Choose a distribution over topics
- For each word position, draw a topic from that distribution, then draw a word from the chosen topic. Repeat this for every word in the document
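The generative story above can be sketched in a few lines of plain Python. This is a toy illustration, not an LDA implementation: the topics, words, and mixing proportions below are made-up numbers, chosen only to show the "draw a topic, then draw a word" loop.

```python
import random

# Hypothetical topics: each topic is a distribution over the vocabulary.
topics = {
    "tech":   {"apple": 0.4, "microsoft": 0.4, "announces": 0.2},
    "retail": {"amazon": 0.5, "sells": 0.3, "things": 0.2},
}

def generate_document(topic_mixture, n_words, seed=0):
    """Generate a bag of words: for each word slot, draw a topic from the
    document's topic mixture, then draw a word from that topic."""
    rng = random.Random(seed)
    topic_names = list(topic_mixture)
    topic_probs = [topic_mixture[t] for t in topic_names]
    words = []
    for _ in range(n_words):
        topic = rng.choices(topic_names, weights=topic_probs)[0]
        vocab = list(topics[topic])
        word = rng.choices(vocab, weights=list(topics[topic].values()))[0]
        words.append(word)
    return words

# A document that is 70% "tech" and 30% "retail" (hypothetical mixture).
doc = generate_document({"tech": 0.7, "retail": 0.3}, n_words=8)
print(doc)
```

Note that the output is just a bag of words; the order never matters, which is exactly the Bag of Words assumption above.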
LDA essentially runs this process backwards: given a bag of words representing a document, what could be the topics that generated it?
So, in your case, the first topic (0)
INFO : topic #0: 0.181*things + 0.181*amazon + 0.181*many + 0.181*sells + 0.031*nokia + 0.031*microsoft + 0.031*apple + 0.031*announces + 0.031*acquisition + 0.031*product
is more about things, amazon, and many, as they have a higher proportion, and not so much about microsoft or apple, which have significantly lower values.
I would suggest reading this blog for a much better understanding ( Edwin Chen is a genius! ) - http://blog.echen.me/2011/08/22/introduction-to-latent-dirichlet-allocation/