Estimate Markov Chain Transition Matrix in MATLAB With Different State Sequence Lengths

Question 1

So for Markov chains, I assume you're only interested in the state transitions. You could group all state transitions into a single Nx2 matrix and then count the number of times a row appears.

For this example I'm using three observations of length 4, 3, and 3. I can use cellfun to group all the state transitions together in a single matrix in the following way:

obs = cell(1, 3);

obs(1) = {[1 2 3 4]};
obs(2) = {[4 5 6]};
obs(3) = {[3 4 5]};

transitions = cellfun(@(x)([x(1:length(x)-1); x(2:length(x))]), obs, 'UniformOutput', false);

alltransitions = cell2mat(transitions)';

Which gives me my observed transitions (1->2, 2->3, 3->4 ...):

alltransitions =

     1     2
     2     3
     3     4
     4     5
     5     6
     3     4
     4     5

To set up the transition matrix, you could take the advice listed here, and count the rows of all of your transitions:

http://www.mathworks.it/matlabcentral/answers/75009-i-ve-a-matrix-of-6x4-and-i-want-to-count-the-rows-how-many-times-it-occur-in-a-matrix

[uniqueTransitions, ~, i]=unique(alltransitions,'rows','stable');
v=arrayfun(@(x) sum(i==x),1:size(uniqueTransitions,1))';
p = v/sum(v);

My vector p contains my transition probability, so I can then go ahead and construct a sparse matrix

transitionMatrix = sparse(uniqueTransitions(:,1), uniqueTransitions(:,2), p, 6,6)

which results in:

transitionMatrix =

   (1,2)       0.1429
   (2,3)       0.1429
   (3,4)       0.2857
   (4,5)       0.2857
   (5,6)       0.1429

Question 2

This maybe an easier way for calculating the transition probability matrix (TPM) for a given sequence of data (one vector) like this:

myS = {S1,S2,S1,S3,...} with as many states as you have;

TPM = hmmestimate(myS,myS);

hmmestimate function is defined under hidden markov models in MATLAB.