Question

I want to match multiple regular expressions against a single string and stop as soon as one of them matches.

I am exploring a few of the solutions listed at http://sljit.sourceforge.net/regex_perf.html

but none of them seems to consider matching multiple regular expressions against a single string.

Is there any solution to speed this up?


Solution

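You could just use alternation. That is, if you're looking for the expressions /a+b/ and /[a-z0-9]+xyz/, you could write a single regular expression with grouping: /(a+b)|([a-z0-9]+xyz)/. The regex engine scans the string once and returns the first match it finds; the capturing group that participated in the match tells you which of the original expressions matched.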
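As a minimal sketch of the alternation approach, here it is in Python's re module. The names PATTERNS, COMBINED and first_match are made up for illustration, and the sketch assumes the individual patterns contain no capturing groups of their own (otherwise the group numbers no longer line up with the pattern list).

```python
import re

# Hypothetical patterns, taken from the example above.
PATTERNS = [r"a+b", r"[a-z0-9]+xyz"]

# Wrap each pattern in its own capturing group and join them with "|".
COMBINED = re.compile("|".join(f"({p})" for p in PATTERNS))


def first_match(text):
    """Return (pattern_index, matched_text) for the first match, or None."""
    m = COMBINED.search(text)
    if m is None:
        return None
    # lastindex is the 1-based number of the capturing group that matched.
    return m.lastindex - 1, m.group(0)


print(first_match("xx aab yy"))
print(first_match("foo123xyz"))
```

Compiling the combined expression once and reusing it avoids paying the compilation cost on every string.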

The Unix fgrep tool does something close to what you're looking for, but for fixed strings rather than regular expressions. If you give it a list of strings to find, it will find all occurrences in a single scan of the file. Dr. Dobb's Journal published an article about it, with C source, sometime back in the late '80s. A quick search reveals that the article was called Parallel pattern matching and fgrep, by Ian Ashdown. I didn't find the source, but I didn't look all that hard. Given a little time, you might have more luck.
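For the fixed-string case, the single-scan idea is usually described in terms of the Aho-Corasick algorithm. The following Python is only a rough sketch of that algorithm, not the fgrep source or the code from the article; build_automaton and find_all are hypothetical names chosen for this example.

```python
from collections import deque


def build_automaton(words):
    """Build a trie with failure links for a list of fixed strings."""
    goto, fail, out = [{}], [0], [[]]
    for w in words:
        state = 0
        for ch in w:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(w)
    # Breadth-first pass: compute failure links level by level.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            # Inherit the words that end at the failure state.
            out[t] = out[t] + out[fail[t]]
    return goto, fail, out


def find_all(text, automaton):
    """Return (start_index, word) for every occurrence, in one pass."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for w in out[state]:
            hits.append((i - len(w) + 1, w))
    return hits


automaton = build_automaton(["he", "she", "his", "hers"])
print(find_all("ushers", automaton))
```

To stop at the first hit instead of collecting all of them, find_all could return as soon as out[state] is non-empty.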

Licensed under: CC-BY-SA with attribution