First answer: Yes, you are exactly right.
Second answer: alpha is a K-vector, as you mentioned. When we take a sample from the Dirichlet distribution, we get another K-vector. Its values depend on the values of alpha, but they are always non-negative and sum to 1, which is why they can be read as the proportions of the topics in one document. We sample once per document to obtain M vectors, and stacking those vectors as rows gives the MxK matrix theta.
The length of the vector we get from sampling the Dirichlet distribution is the same as the length of its parameter, alpha.
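
If it helps to see the shapes concretely, here is a minimal sketch in plain Python. The values of K, M and alpha are made up for illustration, and `sample_dirichlet` is a hypothetical helper, not something from an LDA library. It uses the standard trick of drawing one Gamma(alpha_k, 1) value per component and normalizing, which yields a Dirichlet(alpha) sample.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) sample by normalizing Gamma(alpha_k, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

K = 4              # number of topics (illustrative value)
M = 3              # number of documents (illustrative value)
alpha = [0.5] * K  # symmetric Dirichlet parameter, a K-vector (illustrative value)

rng = random.Random(0)  # fixed seed so the run is repeatable

# One draw per document: M rows, each a K-vector of topic proportions.
theta = [sample_dirichlet(alpha, rng) for _ in range(M)]

for row in theta:
    print([round(p, 3) for p in row], "sum =", round(sum(row), 6))
```

Each printed row has K entries and sums to 1, and there are M rows, so theta is MxK. With NumPy, `numpy.random.default_rng().dirichlet(alpha, size=M)` returns an array of the same shape in one call.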