Regex for a word of a concatenation of 'a's and 'b's. a appears exactly n*2 times in a word and 'b' exactly n*3 times

StackOverflow https://stackoverflow.com//questions/20036291

  •  21-12-2019
  •  | 
  •  

Question

I need to create a NFA (or DFA, don't know which it is going to be yet) for this language:

L(A) = { w ∈ {a,b}* | count(w, a) = 2i, count(w, b) = 3j, i, j ∈ ℕ }

count(w, z) defines how often a letter z appears in the word w. In this case it's a multiple of 2 for 'a', and a multiple of 3 for 'b'.

Examples of a valid word: babab, aabbb, bbaab, bbbabbba

I struggled creating an automaton for that, so I thought I'd try creating a regular expression for it first to turn it to a NFA using this method, because I can easily test it on a regex test sites. But that too didn't work. I ended up with too many combinations that seemingly had no end.

I don't know how to create a regular expression for that without using some sort of counting mechanism. Can somebody give me a hint?

Was it helpful?

Solution

You can model your automaton as six different states, each one describing the state of (count(w, a) mod 2, count(w, b) mod 3). There are six different possible states:

(0, 0) (0, 1) (0, 2) (1, 0) (1, 1) (1, 2)

Every time you read an "a", you go from the current state (e.g. (0, 1)) to the next a-corresponding state (-> (1, 1)). If you read another "a", you go back to the same state again (-> (0, 1)). The same with "b"s, but changing the b-value (e.g. (0, 1) -> (0, 2) -> (0, 0) -> (0, 1)).

The only permited accept state is (0, 0).

Using visual notation:

http://cdn.imghack.se/images/ccea2c62451d81e477f73ac6fabc5134.png

OTHER TIPS

Lets see, you need a even number of a's, that gives you two states

a-even
a-odd

and you need a multiple of three for b's:

b-ok
b-one
b-two

Combining these states, we can build the following grammar:

S(*) -> a A | b B1
A -> a S | b B1A
B1 -> b B2 | a B1A
B1A -> a B1 | b B2A
B2 -> b S | a B2A
B2A -> a B2 | b A

So, this is a regular grammar, all rules have right recursion with one terminal simbol and at most one non-terminal, hope the grammar helps.

The simplest approach to validating your input is to use a regex for your requirement:

^(?=((b*a){2})*b*$)(?=((a*b){3})*a*$)[ab]*$

This uses look aheads to assert the quantity of "a" is a multiple of 2 (including zero), and similarly for "b" bring s multiple if 3.

See a live demo of this regex working with your examples and some invalid examples.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top