Question

Specifically, is there a library that, when given 2 (or more) regular expressions, can tell if exists an input that both would match? Bonus points if it's easily accessible via Java or .NET, but command-line would be fine as well.

Asker's log, supplemental:

The regular expressions that would be fed to this algorithm are fairly simple. While I believe there are a couple with lookaheads, they are all fairly simple combinations of literals or character classes with fixed minimum and maximum lengths.

Was it helpful?

Solution

I found a python library that lets me do what I need to do.

>>> import reCompiler
>>> fsa1 = reCompiler.compileRE('\d\d\d?\d?a')
>>> fsa2 = reCompiler.compileRE('123a')
>>> fsa3 = reCompiler.compileRE('a23a')
>>> print len(FSA.intersection(fsa1, fsa2).finalStates)
1
>>> print len(FSA.intersection(fsa1, fsa3).finalStates)
0

The library is called pyFSA. I will need to implement some preparsing to turn statements like \d{2,4} into \d\d\d?\d?, but other than that it should suit my needs nicely. Thanks for the input, and if people find libraries that implement this in other languages, by all means include them.

OTHER TIPS

If there were it would not run in a useful amount of time. Comparing regex is a PSPACE problem

http://en.wikipedia.org/wiki/PSPACE-complete

You may have some luck if you can allow extra restrictions on your regex

If, I understand you correctly, you would like to know if the intersection of 2 regular expressions is the empty set or not? I believe that's hard, but I wouldn't be surprised if the complexity was exponential in the length of the regex (though some regexes would obivously be easier than others)

Regardless, here's a Haskell implementation: http://sulzmann.blogspot.com/2008/11/playing-with-regular-expressions.html

And a prolog implementation http://www.let.rug.nl/vannoord/Fsa/

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top