Strange behaviours of the String's contains and replaceAll methods with special characters

StackOverflow https://stackoverflow.com/questions/9839232

Question

I did a little research with the String's contains and replaceAll methods.

char c = '*';

String str = "1220"+c+""+c+""+c+""+c+""+c+"23";
System.out.println(str.contains(c+""));
System.out.println(str.contains("["+c+"]"));
System.out.println(str.contains("\\"+c));


System.out.println(str.replaceAll("["+c+"]", "X"));
System.out.println(str.replaceAll("\\"+c, "X"));
System.out.println(str.replaceAll(c+"", "X"));

Results: when c = '*' or '^' or '+'

true
false
false
1220XXXXX23
1220XXXXX23
java.util.regex.PatternSyntaxException

When c = '#' or '~' or '%' or '<' or '>' or '=' or '&' or '@' or '-' or '!'

true
false
false
1220XXXXX23
1220XXXXX23
1220XXXXX23

When c = '$'

true
false
false
1220XXXXX23
1220XXXXX23
1220$$$$$23X

When c = '|'

true
false
false
1220XXXXX23
1220XXXXX23
X1X2X2X0X|X|X|X|X|X2X3X

What is the theory or rule behind this?


Solution

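The argument of contains and the first argument of replaceAll are interpreted differently: the former is a plain character sequence, while the latter is a regular expression. That is why str.contains("["+c+"]") and str.contains("\\"+c) return false: they search for the literal text [*] and \*, neither of which occurs in str. Since * is a meta-character of Java's regular expression language that cannot appear unescaped on its own (it must follow the expression to be repeated zero or more times), replaceAll rejects a bare * with a PatternSyntaxException. Inside a character class ([*]) or after a backslash (\*) it matches a literal asterisk, so those two replaceAll calls work as expected.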
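If the goal is simply to replace a character literally, one way to avoid the regex interpretation is sketched below. It relies on String.replace (which takes plain character sequences) and on Pattern.quote with Matcher.quoteReplacement for the replaceAll route. The class and helper names (LiteralReplaceDemo, replaceLiteral, replaceQuoted) are made up for this illustration and are not part of the original question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LiteralReplaceDemo {

    // Hypothetical helper: String.replace treats both arguments as plain text,
    // so no character has a special meaning.
    static String replaceLiteral(String input, String target, String replacement) {
        return input.replace(target, replacement);
    }

    // Hypothetical helper: the same result through replaceAll. Pattern.quote
    // neutralises regex meta-characters in the target, and
    // Matcher.quoteReplacement does the same for $ and \ in the replacement.
    static String replaceQuoted(String input, String target, String replacement) {
        return input.replaceAll(Pattern.quote(target),
                                Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        for (char c : new char[] {'*', '$', '|', '#'}) {
            String str = "1220" + c + c + c + c + c + "23";
            String target = String.valueOf(c);
            System.out.println(replaceLiteral(str, target, "X"));
            System.out.println(replaceQuoted(str, target, "X"));
        }
    }
}
```

With either helper, every one of the characters tried in the question should behave the same way, since none of them reaches the regex engine unescaped.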

Other tips

str.replaceAll(...) takes a regular expression as its first argument. Characters such as *, $, ^, + and ? are part of regular expression syntax, so they are not matched literally unless escaped. See link for details about how they are treated by replaceAll.
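The different outcomes reported in the question can be reproduced with a small sketch like the following, which passes each character unescaped to replaceAll. The class and method names (MetacharDemo, tryReplaceAll) are illustrative assumptions; the comments describe the standard meaning of each meta-character.

```java
import java.util.regex.PatternSyntaxException;

public class MetacharDemo {

    // Hypothetical helper: calls replaceAll with the regex used as-is and
    // returns either the result or a marker for a rejected pattern.
    static String tryReplaceAll(String input, String regex) {
        try {
            return input.replaceAll(regex, "X");
        } catch (PatternSyntaxException e) {
            return "PatternSyntaxException";
        }
    }

    public static void main(String[] args) {
        // '*' is a quantifier with nothing to repeat, so the pattern is invalid.
        System.out.println(tryReplaceAll("1220*****23", "*"));
        // '$' matches the empty position at the end of the input.
        System.out.println(tryReplaceAll("1220$$$$$23", "$"));
        // '|' separates two empty alternatives, which match at every position.
        System.out.println(tryReplaceAll("1220|||||23", "|"));
        // '#' has no special meaning and is matched literally.
        System.out.println(tryReplaceAll("1220#####23", "#"));
    }
}
```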

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow