They are imagining a fixed model for the data that was established long before that specific message is to be encoded. The model was in principle constructed from a large ensemble of such messages, so there is no reason to believe that `eaii!` by itself should match the probabilities in the model. Of course, the model is just for illustration purposes, and no more real than the `eaii!` message. (Though I think I said exactly that the other day when I was pulling something out of the oven.)
The order of the symbols in the model is arbitrary. It just needs to be the same model on both ends. It is of course important that the probabilities add up to one.
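As a minimal sketch of what "the same model on both ends" means in practice, the snippet below turns a model into cumulative sub-intervals of [0, 1). The symbols and probabilities in `MODEL` are hypothetical values chosen for illustration, not necessarily the ones in the example being discussed, and `build_intervals` is a made-up helper name.

```python
from fractions import Fraction

# Hypothetical fixed model: (symbol, probability) pairs, illustrative values only.
MODEL = [
    ("a", Fraction(2, 10)),
    ("e", Fraction(3, 10)),
    ("i", Fraction(1, 10)),
    ("o", Fraction(2, 10)),
    ("u", Fraction(1, 10)),
    ("!", Fraction(1, 10)),
]

def build_intervals(model):
    """Map each symbol to its [low, high) slice of [0, 1), in model order."""
    total = sum(p for _, p in model)
    if total != 1:
        raise ValueError(f"probabilities sum to {total}, not 1")
    intervals = {}
    low = Fraction(0)
    for symbol, p in model:
        intervals[symbol] = (low, low + p)
        low += p
    return intervals

# Encoder and decoder must both build their intervals from the same ordering.
intervals = build_intervals(MODEL)

# A different ordering is equally valid, but it assigns different slices,
# so it only works if both ends agree on it.
reordered = build_intervals(list(reversed(MODEL)))
```

Each symbol keeps the same interval width under either ordering, which is why the ordering does not affect compression; only the position of the slice changes, which is why a mismatch between the two ends would decode to garbage.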
The second model is simply another arbitrary model to illustrate how a symbol can be coded in less than a bit, when it has a probability greater than 1/2. For that model, each `a` in a series of `a`'s would take a little over half a bit.
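The arithmetic behind "less than a bit" is that an ideal coder spends about -log2(p) bits on a symbol of probability p. Here is a rough sketch of that calculation; `P_A = 0.7` is an assumed probability picked only because it lands a little over half a bit, and `bits_for` is a hypothetical helper name.

```python
import math

def bits_for(p):
    """Ideal cost in bits of coding one symbol that has probability p."""
    return -math.log2(p)

# Assumed probability of 'a' in the second model (illustrative, not from the source).
P_A = 0.7

cost_per_a = bits_for(P_A)      # a little over half a bit
cost_of_run = 10 * cost_per_a   # ideal cost of ten consecutive a's

# For comparison: a symbol with probability exactly 1/2 costs exactly one bit.
cost_at_half = bits_for(0.5)
```

The fractional cost only shows up across a run of symbols, since the coder emits whole bits for the message as a whole rather than for each symbol separately.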