Question

I need to remove some substrings from strings (in a large dataset). The substrings often contain special characters such as ., ^ and /, and replaceAll() interprets its first argument as a regex, so a dot, for example, would match any character, which is not what I want.

Are there other functions that do the "replace" without treating the first argument as a regex?

Solution

Just use String.replace(). Like replaceAll(), it replaces every occurrence, but it treats the target as literal text rather than as a regex, so you don't have to worry about escaping special characters.
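As a rough illustration of the difference, here is a minimal sketch. The class name LiteralReplaceDemo, the helper removeLiteral and the sample string are made up for this example and do not come from the question.

```java
class LiteralReplaceDemo {
    // Hypothetical helper: removes every literal occurrence of target from input.
    static String removeLiteral(String input, String target) {
        // String.replace treats target as plain text, not as a regex.
        return input.replace(target, "");
    }

    public static void main(String[] args) {
        String input = "a.b^c/d.e"; // made-up sample data

        // Only the literal dots are removed.
        System.out.println(removeLiteral(input, ".")); // prints "ab^c/de"

        // With replaceAll the dot is a regex that matches every character,
        // so the whole string is removed and an empty line is printed.
        System.out.println(input.replaceAll(".", ""));
    }
}
```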

Documentation

OTHER TIPS

You can match literally. For instance, if we want to match "<.]}^", we can do:

Pattern pat = Pattern.compile("<.]}^", Pattern.LITERAL);

and use that pattern.
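One possible way to "use that pattern" for removal is sketched below. The class name LiteralPatternDemo, the method strip and the sample input are assumptions made for illustration only.

```java
import java.util.regex.Pattern;

class LiteralPatternDemo {
    // With the Pattern.LITERAL flag, every character of the pattern string
    // is matched as itself; none of them act as regex metacharacters.
    static final Pattern TARGET = Pattern.compile("<.]}^", Pattern.LITERAL);

    // Hypothetical helper: removes every occurrence of the literal target.
    static String strip(String input) {
        return TARGET.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(strip("keep<.]}^this")); // prints "keepthis"
    }
}
```

Compiling the pattern once and reusing it may be worthwhile on a large dataset. Note that the replacement string passed to Matcher.replaceAll still gives special meaning to $ and \, so a non-empty replacement containing those characters would need Matcher.quoteReplacement.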

You can also use backslashes to escape it. Note that the string literal itself needs backslashes, so escaping a single dot will take two backslashes, as follows:

Pattern pat=Pattern.compile("\\.");

The Java compiler turns the two backslashes in the string literal into a single backslash, and the regex parser then reads that single backslash as escaping the dot.
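A small sketch of the two levels of escaping, assuming a hypothetical class EscapedDotDemo and helper removeDots:

```java
class EscapedDotDemo {
    // Hypothetical helper: removes literal dots using an escaped regex.
    static String removeDots(String input) {
        // The regex engine receives the two characters \ and .
        return input.replaceAll("\\.", "");
    }

    public static void main(String[] args) {
        // The source literal has three characters, the string itself has two.
        System.out.println("\\.".length());      // prints 2
        System.out.println(removeDots("1.2.3")); // prints "123"
    }
}
```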

Just use String.replace(String, String), not replaceAll. String.replace doesn't treat its argument as a regex.

There are 2 methods named replace in the String class that perform replacement without treating their parameters as regular expressions.

One replace method replaces one char with another char.

The other replace method replaces a CharSequence (usually a String) with another CharSequence.

Quoting the Javadocs from the second replace method:

Replaces each substring of this string that matches the literal target sequence with the specified literal replacement sequence.
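To make the two overloads concrete, here is a short sketch. The class name ReplaceOverloadsDemo, the method names and the sample strings are invented for this example.

```java
class ReplaceOverloadsDemo {
    // Uses replace(char, char): swaps every '.' for '-'.
    static String swapChar(String s) {
        return s.replace('.', '-');
    }

    // Uses replace(CharSequence, CharSequence): both the target "^/" and
    // the replacement "$1" are taken literally, so "$1" is not a group reference.
    static String swapSequence(String s) {
        return s.replace("^/", "$1");
    }

    public static void main(String[] args) {
        System.out.println(swapChar("a.b.c"));       // prints "a-b-c"
        System.out.println(swapSequence("x^/y^/z")); // prints "x$1y$1z"
    }
}
```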

Are there other functions that do the "replace"

Yes, it is called replace :) The main difference from replaceAll is that replace treats its target as literal text, so regex special characters have no special meaning in it.


BTW, if you want to escape all of the regex special characters in a string, you can

  • use yourString = Pattern.quote(yourString),
  • surround it with "\\Q" and "\\E".

To escape only some special characters, you can

  • put "\\" before them, like "\\.",
  • surround them with "[" and "]", like "[.]" (this works for most special characters).
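
The options listed above could be sketched roughly as follows. The class QuoteDemo and its method names are hypothetical, and the "\\Q"..."\\E" variant assumes the quoted text does not itself contain \E.

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // Pattern.quote returns a regex that matches the given text literally.
    static String removeQuoted(String input, String literal) {
        return input.replaceAll(Pattern.quote(literal), "");
    }

    // Manual quoting; assumes literal does not contain "\\E".
    static String removeWithQE(String input, String literal) {
        return input.replaceAll("\\Q" + literal + "\\E", "");
    }

    // Inside a character class the dot loses its special meaning.
    static String removeDotsWithClass(String input) {
        return input.replaceAll("[.]", "");
    }

    public static void main(String[] args) {
        System.out.println(removeQuoted("1.5^2/3", ".5^")); // prints "12/3"
        System.out.println(removeWithQE("1.5^2/3", ".5^")); // prints "12/3"
        System.out.println(removeDotsWithClass("a.b.c"));   // prints "abc"
    }
}
```

Pattern.quote is probably the safer choice of the two quoting forms, since it also copes with text that contains \E.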
Licensed under: CC-BY-SA with attribution