Question

I have some text files that contain runs of repeated punctuation marks, and I need to reduce each run to a single punctuation mark.

Here is some sample text:

They are working in London..... he is a Java developer !!!!! they are playing------ She is working_______

This is the required output:

They are working in London.he is a Java developer !they are playing- She is working_

I need some help with the Java regex.

Thanks


Solution

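Use a backreference (\1+) to match a repeated character: the group captures one punctuation mark, and \1+ matches one or more further copies of that same mark.

Try the following: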

String text = "They are working in London..... he is a Java developer !!!!! they are playing------ ---- ---- She is working_______";
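// ([-.!_]) captures one mark, \1+ requires at least one repeat, and \s* also consumes trailing whitespace;
// the outer (?:...)+ merges adjacent runs such as "---- ---- " into a single match.
String replaced = text.replaceAll("(?:([-.!_])\\1+\\s*)+", "$1");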
System.out.println(replaced);

This prints:

They are working in London.he is a Java developer !they are playing-She is working_
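If the set of marks is not known in advance, the same backreference idea could be applied to any ASCII punctuation character. The following is only a sketch: the class name `PunctuationCollapser` and the method `collapse` are made up for illustration, and unlike the snippet above it leaves whitespace after the marks untouched. It relies on `\p{Punct}`, which in `java.util.regex` matches ASCII punctuation by default.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original answer.
class PunctuationCollapser {
    // (\p{Punct}) captures one ASCII punctuation character;
    // \1+ matches one or more further copies of that same character.
    private static final Pattern REPEATED_PUNCT = Pattern.compile("(\\p{Punct})\\1+");

    // Replaces every run of an identical punctuation mark with a single copy.
    static String collapse(String text) {
        return REPEATED_PUNCT.matcher(text).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("Wait... what?!! Really-----yes"));
    }
}
```

With the sample in `main`, this should print `Wait. what?! Really-yes`. The `?` survives because it is not repeated; only runs of the same mark are collapsed.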

OTHER TIPS

You can try this:

   String str = "They are working in London..... he is a Java developer !!!!! they are playing-----She is working_______";
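   // [-.!_] lists the marks to collapse (a | inside a character class would be a literal pipe, so it is omitted);
   // \1+ matches further copies of the captured mark.
   String newStr = str.replaceAll("([-.!_])\\1+", "$1");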
   System.out.println(newStr);


Output:

They are working in London. he is a Java developer ! they are playing-She is working_

Try something like this:

/([.;,?!_-])\1+/$1/
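The slash-delimited form above is generic pattern/replacement notation rather than Java. A rough Java sketch of the same idea, assuming the goal is to collapse runs of the same mark, might look like the following. The class name `CollapseDemo` and the method `collapseRepeats` are hypothetical. Inside a character class, a hyphen between two characters forms a range, so it is placed last here to be taken literally.

```java
// Hypothetical demo class, not part of the original answer.
class CollapseDemo {
    // ([.;,?!_-]) captures one of the listed marks (the trailing '-' is literal);
    // \1+ then matches one or more repeats of exactly that mark.
    static String collapseRepeats(String input) {
        return input.replaceAll("([.;,?!_-])\\1+", "$1");
    }

    public static void main(String[] args) {
        String sample = "They are working in London..... he is a Java developer !!!!! "
                + "they are playing------ She is working_______";
        System.out.println(collapseRepeats(sample));
    }
}
```

For the question's sample text this should print `They are working in London. he is a Java developer ! they are playing- She is working_`, with the spaces after each mark left in place.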

Licensed under: CC-BY-SA with attribution