The answer to this is given in Rabiner's HMM tutorial paper, Section V-C, p. 273:
Basically, there is no simple or straightforward answer to the above question. Instead, experience has shown that either random (subject to the stochastic and the nonzero value constraints) or uniform initial estimates of the prior probabilities and the transition matrix is adequate for giving useful reestimates of these parameters in almost all cases.
However, for the emission matrix, experience has shown that good initial estimates are helpful in the discrete symbol case, and are essential (when dealing with multiple mixtures) in the continuous distribution case.
Such initial estimates can be obtained in a number of ways, including:
1) manual segmentation of the observation sequences into states with averaging observations within states,
2) maximum likelihood segmentation of observations with averaging,
3) segmental k-means segmentation with clustering,
etc.
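
As a rough illustration of the advice quoted above, here is a minimal NumPy sketch. It is not taken from Rabiner's paper. The function names (`random_stochastic`, `init_pi_A`, `kmeans_emission_init`) and the `floor` parameter are made up for this example. It assumes single-Gaussian emissions with diagonal covariance, and it uses plain k-means on the pooled observations as a simplified stand-in for the clustering step, not the full segmental k-means procedure (which alternates clustering with Viterbi re-segmentation).

```python
import numpy as np

def random_stochastic(shape, rng, floor=1e-3):
    """Random array whose last axis is strictly positive and sums to 1."""
    m = rng.random(shape) + floor  # floor keeps every entry nonzero
    return m / m.sum(axis=-1, keepdims=True)

def init_pi_A(n_states, rng=None):
    """Uniform prior/transition estimates if rng is None, random otherwise."""
    if rng is None:
        pi = np.full(n_states, 1.0 / n_states)
        A = np.full((n_states, n_states), 1.0 / n_states)
    else:
        pi = random_stochastic(n_states, rng)
        A = random_stochastic((n_states, n_states), rng)
    return pi, A

def kmeans_emission_init(obs, n_states, rng, n_iter=20):
    """Initial Gaussian emission means/variances from k-means clusters.

    Assumption: one diagonal-covariance Gaussian per state.
    """
    obs = np.asarray(obs, dtype=float)
    if obs.ndim == 1:
        obs = obs[:, None]  # treat 1-D input as (T, 1)
    # Start from n_states distinct observations chosen at random.
    centers = obs[rng.choice(len(obs), size=n_states, replace=False)]
    for _ in range(n_iter):
        # Assign each observation to its nearest center.
        dists = ((obs[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each non-empty cluster's center to the mean of its members.
        for k in range(n_states):
            if np.any(labels == k):
                centers[k] = obs[labels == k].mean(axis=0)
    # Final assignment against the updated centers.
    dists = ((obs[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = dists.argmin(axis=1)
    variances = np.empty_like(centers)
    for k in range(n_states):
        members = obs[labels == k]
        # Fall back to the global variance for tiny clusters.
        source = members if len(members) > 1 else obs
        variances[k] = source.var(axis=0) + 1e-6  # avoid zero variance
    return centers, variances
```

A typical use would be `pi, A = init_pi_A(3)` for a uniform start, then `means, variances = kmeans_emission_init(X, 3, np.random.default_rng(0))` for the emission side, with all four arrays passed to whatever Baum-Welch implementation you are using as its starting point.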