Question

Supposing we have two grammars which define the same languge: regular one and LALR(1) one.

Both regular and LALR(1) algorithms are O(n) where n is input length.

Regexps are usually preferred for parsing regular languages. Why? Is there a formal proof (or maybe that's obvious) that they are faster?

Was it helpful?

Solution

You should prefer stackless automaton over pushdown one as there is much more developed maths for regular language automatons.

We are able to perform determinization for both types of automaton, but we are unable to perform efficient minimization of PDA. The well known fact is that for every PDA there exists equivalent one with the only state. This means that we should minimize it with respect to transitions count/max stack depth/some other criteria.

Also the problem of checking whether two different PDAs are equivalent with respect to the language they recognize is undecidable.

OTHER TIPS

There is a big difference between parsing and recognizing. Although you could build a regular-language parser, it would be extremely limited, since most useful languages are not parseable with a useful unambiguous regular grammar. However, most (if not all) regular expression libraries recognize, possibly with the addition of a finite number of "captures".

In any event, parsing really isn't the performance bottleneck anymore. IMHO, it's much better to use tools which demonstrably parse the language they appear to parse.

On the other hand, if all you want to do is recognize a language -- and the language happens to be regular -- regular expressions are a lot easier and require much less infrastructure (parser generators, special-purpose DSLs, slightly more complicated Makefiles, etc.)

(As an example of a language feature which is not regular, I give you: parentheses.)

People prefer regular expressions because they're easier to write. If your language is a regular language, why bother creating a CFG grammer for it?

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top