Domanda

Are there such things?

Like for example, S -> aSb | ^ (possible words: ^, ab, aabb, aaabbb, aaaabbbb, ...)

From what I've learned, the only regex that closely match the said grammar is: a*b*

But the regex can produce words such as aab, abb, ... where a's and b's aren't equal.

Is there a solution to this? Something like: a*b* if #a = #b

EDIT: I think there is no solution to this.

What is the correct explanation for this? This is actually a snippet of my homework, and I really don't know what to answer since there are no solutions in translating the grammar to regex.

È stato utile?

Soluzione

If you are talking about formal language theory then of course all non-regular grammars (as in your example) can't be expressed with a regular expression (per definition).

But if you are wondering of what different regex flavors (in programming languages/regex libs) can do, then you can match all kinds of non-regular grammars/languages.

For example in Perl/PCRE you can match your example language with any of these:

  • Using recursion/sub pattern calls:

    ^(a(?1)b)$

  • Using a backreference (with a conditional):

    ^(?:a(?=a*(b(?(1)\1))))+\1$|^$

You may be interested in this questions and answers: Match a^n b^n c^n (e.g. "aaabbbccc") using regular expressions (PCRE)

Altri suggerimenti

In formal language theory something called a "pumping lemma" can be used to prove that certain sets of sentences (languages) can not be described by a regular expression. See wikipedia http://en.wikipedia.org/wiki/Pumping_lemma_for_regular_languages. You start from the language you want to describe and use the pumping lemma to find a contradiction. The proof for your example is actually on that wikipedia page.

A similar theory exists for context-free languages. Some languages just can not be described by context-free grammars.

Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top