Question

I have something like the following in a string:

blah blah

BEGINIGNORE
   this stuff should get stripped out
ENDIGNORE

more stuff here

I would like to do this (Perl syntax): s/BEGINIGNORE.*ENDIGNORE//s -- namely, strip out everything between BEGINIGNORE and ENDIGNORE, inclusive. You would think the following would do that in Mathematica:

StringReplace[str, re["BEGINIGNORE[.\\s]*ENDIGNORE"]->""]

But it doesn't. How do I do this in Mathematica?

PS: I define the following alias: re = RegularExpression;


Solution 3

Insert the (?s) modifier in the regex. That's equivalent to Perl's /s modifier and is part of standard PCRE syntax.

StringReplace[str, re["BEGINIGNORE(?s).*ENDIGNORE"]->""]
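For a quick check, here is a minimal sketch that applies the (?s) modifier to a sample string modeled on the one in the question. The contents of str are made up for illustration. The second call uses the non-greedy .*? quantifier, which is not part of the original answer but may be what you want if the string can contain more than one BEGINIGNORE/ENDIGNORE block.

```mathematica
re = RegularExpression;  (* alias from the question *)

(* Hypothetical sample input, modeled on the question's example *)
str = "blah blah\n\nBEGINIGNORE\n   this stuff should get stripped out\nENDIGNORE\n\nmore stuff here";

(* (?s) makes . match newlines too, so .* can span several lines *)
StringReplace[str, re["(?s)BEGINIGNORE.*ENDIGNORE"] -> ""]

(* Non-greedy variant: each match stops at the nearest ENDIGNORE,
   so text between two separate ignore blocks is kept *)
StringReplace[str, re["(?s)BEGINIGNORE.*?ENDIGNORE"] -> ""]
```

With this sample both calls give the same result: the block from BEGINIGNORE through ENDIGNORE is removed and the surrounding blank lines stay in place.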

More details in this answer to a related question: Bug in Mathematica: regular expression applied to very long string

OTHER TIPS

It turns out that for some reason "[.\\s]" and "[.\\n]" don't work but "(.|\\n)" does. So the following works:

strip[s_String] := StringReplace[s, re@"BEGINIGNORE(.|\\n)*ENDIGNORE" -> ""]

Try:

StringReplace[str, re["BEGINIGNORE(.|\\n)*ENDIGNORE"]->""]

As you noted in your follow-up, you need parentheses rather than square brackets around the expression that you want to repeat with *.

The square brackets define a character class here, as in most regular expression languages. That's why [.\\s] isn't working as you expected: it stands for a set of characters rather than a parenthesized expression, and inside a character class the . is a literal period, not "any character". So [.\\s]* only matches runs of periods and whitespace. Maybe the Mathematica use of [] for expressions got you thinking in that direction?
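The difference between the character class and the parenthesized alternation can be checked directly. This is a small sketch using StringMatchQ, which tests whether the whole string matches the pattern; the test strings are arbitrary examples.

```mathematica
(* Inside [...] the dot is a literal period *)
StringMatchQ[".", RegularExpression["[.\\s]"]]    (* True: a literal period *)
StringMatchQ["a", RegularExpression["[.\\s]"]]    (* False: "a" is neither a period nor whitespace *)

(* Outside a class the dot means "any character except newline",
   so (.|\\n) covers every character *)
StringMatchQ["a", RegularExpression["(.|\\n)"]]   (* True *)
StringMatchQ["\n", RegularExpression["(.|\\n)"]]  (* True *)
```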

Licensed under: CC-BY-SA with attribution