Question

If we describe regular expressions with operators *, | and concatenation . (which we simply omit for clarity), and parenthesis (, ) and some letters from some alphabet Sigma, then is the language that describes regular expressions itself regular? In my opinion no, because the fact that we have parenthesis means that no finite state machine can recognise the input, so it has to be a context free language.

Note on off-topic-ness

I stand by my position that this question is relevant to programming since I came up with it while thinking about coding up a regular expression recogniser. If someone wants to implement such a thing, then one will soon realise that you actually need a context-free parser to parse regular expressions and this question will answer this. Moreover, the answer and question are not 'very theoretic' since the topic of finite automata is considered year 1&2 undergraduate material so putting it in the Theoretical Computer Science Stackexchange will be an overkill.

Was it helpful?

Solution

No, it is not regular. Consider that the language of balanced parenthesis alone is not even regular. The proof by contradiction using the pumping lemma for the language of balanced parenthesis also works for the language of regular expressions.

It is Context Free though, and it is very easy to describe with a Context-free grammar:

S -> SS
S -> S|S
S -> S*
S -> (S)
S -> a
S -> b
S -> c
S -> ...     // Continue for all terminals in the alphabet, and
S -> epsilon // The empty string
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top