Question

I heard that some operations involving regexes that do not have common elements can NOT be generated using a finite automata. I do not remember what it was, where it was from, can anyone tell me what that could possibly be?

Edit : I finally managed to find the exact statement. Here it is, below :

A language (like $(00)^n (11)^n$) can not be generated by an FSM since there is no common character in "00" and "11". We need to keep count of both which is not possible with a finite automata as the count can be infinite.

Was it helpful?

Solution

The remark by the tutor is very poorly worded. My best guess is that it refers to the following situation. Let $L_{x,y}$ be the set of all words that have as many copies of $x$ as that of $y$. Then $L_{0,1}$ is not regular, but $L_{01,10}$ is regular, since it equals $(0^+1^+)^*0^+ + (1^+0^+)^*1^+ + \epsilon$.

OTHER TIPS

One has to differentiate between regex in computer science and regex in real world programming languages. In compsci regular expressions are defined to only recognize regular languages, which are the same languages DFAs can recognize, thereby making DFAs and regex a descriptions of the same class of languages.

In real world application the recognition of regular languages is somewhat limiting which is the reason why many regex libraries introduce operators which allow you to recognize languages that are not regular. One such operator is recursion (?R) see here for an example using python. Notice that the language in the article is the balanced parentheses language which is a well known context free language that is not regular and thus can not be recognized by a DFA.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top