Mathematica StringReplace to replace a substring containing newlines
22-08-2019
Question
I have something like the following in a string:
blah blah
BEGINIGNORE
this stuff should get stripped out
ENDIGNORE
more stuff here
I would like to do the equivalent of this Perl substitution: s/BEGINIGNORE.*ENDIGNORE//s, namely, strip out everything between BEGINIGNORE and ENDIGNORE, inclusive. You would think the following would do that in Mathematica:
StringReplace[str, re["BEGINIGNORE[.\\s]*ENDIGNORE"]->""]
But it doesn't. How do I do this in Mathematica?
PS: I define the following alias: re = RegularExpression;
Solution 3
Insert the (?s) modifier in the regex. That's equivalent to Perl's /s modifier and is part of standard PCRE syntax.
StringReplace[str, re["BEGINIGNORE(?s).*ENDIGNORE"]->""]
More details in this answer to a related question: Bug in Mathematica: regular expression applied to very long string
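As a minimal sketch of the (?s) approach, the snippet below builds a sample string modeled on the one in the question (the exact text is only illustrative) and strips the ignored block. The second call is an assumption about what you may want rather than part of the original answer: with a greedy .* and several BEGINIGNORE/ENDIGNORE pairs, everything from the first BEGINIGNORE to the last ENDIGNORE is removed, so a non-greedy .*? may be preferable.

```mathematica
(* Sample input modeled on the question; the wording is illustrative. *)
str = "blah blah\nBEGINIGNORE\nthis stuff should get stripped out\nENDIGNORE\nmore stuff here";

(* (?s) lets the dot match newlines too, like Perl's /s flag. *)
StringReplace[str, RegularExpression["(?s)BEGINIGNORE.*ENDIGNORE"] -> ""]

(* Non-greedy variant: each BEGINIGNORE...ENDIGNORE block is removed separately. *)
StringReplace[str, RegularExpression["(?s)BEGINIGNORE.*?ENDIGNORE"] -> ""]
```

Placing (?s) at the start of the pattern, rather than in the middle, applies it to the whole regex and makes the intent easier to spot.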
OTHER TIPS
It turns out that for some reason "[.\\s]" and "[.\\n]" don't work but "(.|\\n)" does. So the following works:
strip[s_String] := StringReplace[s, re@"BEGINIGNORE(.|\\n)*ENDIGNORE" -> ""]
Try:
StringReplace[str, re["BEGINIGNORE(.|\\n)*ENDIGNORE"]->""]
As you noted in your follow-up, you need parentheses rather than square brackets around the expression that you want the * to apply to.
The square brackets define a character class here, as in most regular expression languages. That's why [.\\s] isn't working as you expected: it stands for a set of characters rather than a parenthesized expression, and inside a character class the dot is a literal period, not a wildcard. Maybe the Mathematica use of [] for expressions got you thinking in that direction?
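The difference between the character class and the parenthesized alternation can be checked directly with StringMatchQ. This is only a small sketch using single-character test strings chosen for illustration:

```mathematica
(* Inside [...] the dot is a literal period, so the class matches only "." or whitespace. *)
StringMatchQ[".", RegularExpression["[.\\s]"]]   (* True *)
StringMatchQ["\n", RegularExpression["[.\\s]"]]  (* True *)
StringMatchQ["a", RegularExpression["[.\\s]"]]   (* False: letters are not in the class *)

(* Outside a class the dot is a wildcard, and the alternation adds the newline. *)
StringMatchQ["a", RegularExpression["(.|\\n)"]]   (* True *)
StringMatchQ["\n", RegularExpression["(.|\\n)"]]  (* True *)
```

Since the text between BEGINIGNORE and ENDIGNORE contains letters, [.\\s]* cannot span it, while (.|\\n)* can.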