Question
I need to remove all punctuation from words in Java. I tried this:
System.out.println("do.,it".replaceAll("[^\\w]", ""));
System.out.println("сказочники".replaceAll("[^\\w]", ""));
But it doesn't work with Cyrillic or other non-Latin scripts. I already tried \p{Punct}, but that list is not complete; for example, „ and » are missing.
Solution
Not sure if Java supports this, but try:
"сказочники".replaceAll("\\P{wd}+", "")
where \P{wd} stands for any non-word character in any language. It is the opposite of \p{wd}. Note that the backslash has to be doubled inside a Java string literal.
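In case \P{wd} is not accepted, here is a minimal sketch of the same idea using a flag that java.util.regex.Pattern does document: (?U), i.e. Pattern.UNICODE_CHARACTER_CLASS (Java 7 and later), which makes \w and \W follow Unicode rather than ASCII. The class and method names are made up for illustration and are not part of the original answer.

```java
import java.util.regex.Pattern;

class UnicodeWordFilter {
    // (?U) enables Pattern.UNICODE_CHARACTER_CLASS, so \W means
    // "not a Unicode word character" instead of "not [a-zA-Z_0-9]".
    private static final Pattern NON_WORD = Pattern.compile("(?U)\\W+");

    // Hypothetical helper: removes every run of non-word characters.
    static String stripNonWord(String text) {
        return NON_WORD.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(stripNonWord("do.,it"));        // doit
        System.out.println(stripNonWord("„сказочники»"));  // сказочники
    }
}
```

Keep in mind that digits and the underscore count as word characters here, so they are left in place.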
Other tips
Try this regex:
text = text.replaceAll("[^a-zA-Z0-9\\s]", "");
This removes every character except ASCII letters, digits, and whitespace.
Edit:
Since the text is in a different language, you can list the characters to remove explicitly instead. Suppose you have to remove - + ^ . : ,
Then try this:
text = text.replaceAll("[\\-\\+\\.\\^:,]","");
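If listing every character by hand gets tedious, one possible alternative (a sketch, not something the answer above proposes) is to use the Unicode general categories that Java's regex engine supports: \p{P} for punctuation and \p{S} for symbols. The class and method names below are hypothetical.

```java
class PunctuationFilter {
    // \p{P} matches any Unicode punctuation (including „ and »),
    // \p{S} matches any Unicode symbol (including + and ^).
    static String stripPunctuationAndSymbols(String text) {
        return text.replaceAll("[\\p{P}\\p{S}]+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripPunctuationAndSymbols("a-b+c^d.e:f,g")); // abcdefg
        System.out.println(stripPunctuationAndSymbols("„сказочники»"));  // сказочники
    }
}
```

Unlike \p{Punct}, which by default only covers ASCII punctuation, these categories are not limited to one script. Whitespace and digits are left untouched.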
My solution seems to be:
System.out.println("сказ очники»»«„“‚‘›‹".replaceAll("[^\\p{L}]", ""));
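One thing to watch: [^\\p{L}] removes everything that is not a letter, so the space and any digits disappear as well. If those should stay, a variant along these lines might work. This is only a sketch, and the class and method names are invented for the example.

```java
class LetterFilter {
    // Keeps letters (\p{L}), numbers (\p{N}) and whitespace (\s);
    // everything else, including quotes like » and „, is removed.
    static String keepLettersDigitsSpaces(String text) {
        return text.replaceAll("[^\\p{L}\\p{N}\\s]", "");
    }

    public static void main(String[] args) {
        System.out.println(keepLettersDigitsSpaces("сказ очники 42»»«„“‚‘›‹")); // сказ очники 42
    }
}
```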