DFA with limited states
-
16-10-2019 - |
Question
Lets $L_z \ := \{ a^i b^i c^i : 0 \leq i < z \}$
$\{a,b,c\} \in \sum^*$
there is a DFA with $\frac{z(z+1)}{2}+1$ states - How can I prove this?
And I need largest possible number $n_z$, for which i can prove that every NFA, which accepts $L_z$, have $n_z$ states, at least!
But first I need to show that $n_z = \frac{z(z+1)}{2}$ right?
Solution
Constructing a DFA
There is an algorithm to construct a minimal DFA for a regular language. In this case, it isn't too hard to come up with a DFA with $z(z+1)/2$ states.
Observe that $L_z = \{\epsilon, abc, aabbcc, \ldots, a^{z-1}b^{z-1}c^{z-1}\}$. Once you know the number of $a$'s at the beginning of the word, you know how many $b$'s and $c$'s to expect. Here's a simple DFA built from this observation (with $z' = z-1$):
[source]
Following the usual convention, if there is no transition for a particular letter from a particular node, the automaton goes to a sink state.
This automaton has $n$ states reached after reading $0$ to $z-1$ letters a
($k,0,0$ for $0\le k\le z-1$), and branches of length $2k$ for $0 \le k \le z-1$ to read the right number of b
's and c
's ($k,i,0$ and $k,k,i$ for $1 \le i \le k \le z-1$). That's a total of $z + \sum_{k=0}^{z-1} (2k) = z^2$ states, plus one to account for the implicit sink state.
It's possible to compress this automaton a bit.
Notice that every side branch finishes by counting the
c
's. Instead of tracking how manyc
's have already been seen, think in terms of how manyc
's are left to read. You can merge all the states of the form $k,k,k-n$, for all $k \ge 1$, into a state $-,n$ from which $n$ occurrences ofc
must be read. In particular, there will be only two final states: one for the empty word, and one after reading the right number ofc
's. In this automaton, the $k$th side branch has $k$ edges to read theb
's (so $k-1$ states, as the start state belongs to thea
trunk, and the end state belongs to thec
collector), and there are $z-1$ transitions to count thec
's (so $z$ states, corresponding to $0$ through $z-1$c
's left).
The total number of states in this automaton (including the sink state) is $z + \sum_{k=1}^{z-1} (k-1) + (z-1) + 1 = z(z+1)/2 + 1$.
Constructing an NFA
In other words, $n_z$ is the number of states in a minimal NFA that recognizes $L_z$. If you have a DFA with $N$ states, that's an NFA with $N$ states. So $n_z \le z(z+1)/2+1$. For most languages, a minimal NFA is smaller than a minimal DFA: nondeterminism adds expressive power in terms of how much you can accomplish with a number of states.
In this particular case, it happens that the minimal DFA above is also a minimal NFA (if I didn't make a mistake). Here's an informal proof.
Consider the states in the middle of reading the
b
's. When reading $a^k b^i | b^{k-i} c^k$, where $|$ represents the current position, the state must account for both $k$ and $i$, to remember both the number of seena
's (which is necessary, because we'll need to count thec
's later) and the number of seenb
's (or, equivalently, the number ofb
's still required). This requires distinct states for every $(i,k)$ such that $1 \le i \lt k \le z-1$. In addition, there must be $z-1$ states to count thea
's at the beginning, and $z-1$ states to count thec
's at the beginning, plus a sink state. That's a total of $z(z+1)/2+1$ states.
OTHER TIPS
I will give you an indication: first, write an automaton which recognizes exactly the language $M_n=\{b^nc^n\}$. How many states has this automaton?
Now use the pumping lemma to prove that this is the minimum number of states for $M_n$. The pumping lemma can be refined for automata with fixed number of states which is exaclty wikipedia's $p$ in "there exists $p$ such that ...".
Use the automata for $M_n$ to build your DFA for $L_z$.
EDIT: I removed the last part: you can use the pumping lemma to prove the minimum size for $M_n$ but not (at least not directly) for $L_z$.
To prove that you got the minimum size for the automaton, you can reason about the languages corresponding to each state, corresponding to the residual languages $w^{-1}L_z$. The number of such residual languages is a lower bound on the number of states for $L_z$.