Question

I am fairly new to using regex with Java. My goal is to escape all occurrences of '*' with a backslash. This is the statement I tried:

String replacementStr= str.replaceAll("(?=\\[*])", "\\\\");

This does not seem to work. After some tinkering, I found that this does work:

String replacementStr= str.replaceAll("(?=[]\\[*])", "\\\\");

Based on what I know of regular expressions, I thought '[]' represented an empty character class. Am I missing something here? Can someone please help me understand this?

Note: The point of this exercise was to learn the lookahead feature of regex. The stated goal does not require lookahead; I am only using it for educational purposes. Sorry for not making that clear!

Solution

Most metacharacters do not need to be escaped when they appear inside a character class (square brackets).

Alternatively, if you simply want to turn each * into \*, no regex is needed. In that case, try the following:

String newStr = str.replace("*", "\\*");

EDIT: There is something curious in your regular expressions.

  • (?=\[*]) Look ahead for the character [ (0 or more times), followed by ]. It never looks for *, so nothing is inserted before an asterisk.

  • (?=[]\[*]) Look ahead for one of the following characters: [, ], *

Perhaps the regex that you are looking for is the following:

(?=\*)

In Java, "(?=\\*)"
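To compare the three lookahead patterns discussed above side by side, here is a minimal sketch. The class name, method names and sample input are made up for illustration; only the patterns and the replacement string come from the question and this answer.

```java
public class LookaheadEscapeDemo {
    // First attempt from the question: lookahead for zero or more '[' followed by ']'.
    static String firstAttempt(String s) {
        return s.replaceAll("(?=\\[*])", "\\\\");
    }

    // Second attempt from the question: lookahead for one of ']', '[' or '*'.
    static String secondAttempt(String s) {
        return s.replaceAll("(?=[]\\[*])", "\\\\");
    }

    // Pattern suggested in this answer: lookahead for a literal '*'.
    static String suggested(String s) {
        return s.replaceAll("(?=\\*)", "\\\\");
    }

    public static void main(String[] args) {
        String input = "a*b[c]"; // hypothetical sample input
        System.out.println(firstAttempt(input));
        System.out.println(secondAttempt(input));
        System.out.println(suggested(input));
    }
}
```

With this input, the first pattern should only insert a backslash before the ], the second before *, [ and ], and the third only before the *.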

OTHER TIPS

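Instead of replaceAll("(?=\\[*])", "\\\\"), you can simply call String.replace. Note that the call below swaps each * for a backslash; to keep the asterisk, use "\\*" as the replacement: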

String newStr = str.replace("*", "\\");

There is no need for a regex here.

For example:

String str = "abc*123*";
String newStr = str.replace("*", "\\");
System.out.println(newStr);

This prints:

abc\123\

See the documentation for String.replace: it treats both arguments as literal text, not as a regex.
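Since it is not entirely clear whether the asterisk should be kept, the sketch below shows both literal replacements next to each other. The class and method names are hypothetical; the sample string is the one used above.

```java
public class LiteralReplaceDemo {
    // Swaps each '*' for a single backslash; the asterisk itself is lost.
    static String swapForBackslash(String s) {
        return s.replace("*", "\\");
    }

    // Keeps each '*' and puts a backslash in front of it.
    static String prefixWithBackslash(String s) {
        return s.replace("*", "\\*");
    }

    public static void main(String[] args) {
        String str = "abc*123*";
        System.out.println(swapForBackslash(str));    // abc\123\
        System.out.println(prefixWithBackslash(str)); // abc\*123\*
    }
}
```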

The following code works:

String strTest = "jhgfg*gfb*gfhh";
// Regex version; the non-regex equivalent is strTest.replace("*", "\\")
strTest = strTest.replaceAll("\\*", "\\\\");
System.out.println("String is : " + strTest);

Output:

String is : jhgfg\gfb\gfhh
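The four backslashes are needed because replaceAll gives backslash (and $) a special meaning in the replacement string, on top of Java's own string escaping. If counting backslashes gets tedious, one option is to let Pattern.quote and Matcher.quoteReplacement do the escaping. This is only a sketch; the class and the replaceLiteral helper are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteReplacementDemo {
    // Hypothetical helper: treats both target and replacement as literal text,
    // while still going through the regex-based replaceAll.
    static String replaceLiteral(String s, String target, String replacement) {
        return s.replaceAll(Pattern.quote(target),
                            Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String strTest = "jhgfg*gfb*gfhh";
        // Same effect as replaceAll("\\*", "\\\\") above.
        System.out.println(replaceLiteral(strTest, "*", "\\"));
    }
}
```

In practice String.replace already does this for plain literal replacements, so the helper is mainly useful for seeing where the extra escaping comes from.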

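If the regex engine finds a ] immediately after the opening [, it treats that ] as a literal ]. Your second pattern therefore contains a single character class, []\[*], which matches ], [ or *. This is never a problem because an empty character class is useless anyway, and it means you can avoid some character escaping.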

There are a few rules for characters you don't have to escape in character classes:

  • in [] (or [^]), the ] is literal
  • in [-.....] or [^-.....] or [.....-] or [^.....-], the - is literal
  • ^ is literal unless it is at the start of the character class

So you never need to escape ], - or ^, provided you place them in the positions listed above.
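The rules above can be checked in Java with a few small character classes. This is a sketch with made-up class and field names, not an exhaustive test of the engine.

```java
import java.util.regex.Pattern;

public class CharClassLiteralsDemo {
    // ']' directly after '[' is literal, so this class holds ']' and 'a'.
    static final Pattern BRACKET_FIRST = Pattern.compile("[]a]");
    // '-' at the end of the class is literal, so this class holds 'a' and '-'.
    static final Pattern HYPHEN_LAST = Pattern.compile("[a-]");
    // '^' that is not at the start is literal, so this class holds 'a' and '^'.
    static final Pattern CARET_NOT_FIRST = Pattern.compile("[a^]");

    // True if the whole string matches the pattern.
    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(BRACKET_FIRST, "]"));
        System.out.println(matches(HYPHEN_LAST, "-"));
        System.out.println(matches(CARET_NOT_FIRST, "^"));
    }
}
```

Note that in Java a [ inside a character class still needs escaping, which is why the pattern in the question keeps the \[.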

This is down to the Perl origins of the regex syntax. It's a very Perl-style way of doing things.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow