Tokenizer in Java dealing with spaces and apostrophes
06-12-2019
Question
I was wondering if there is any way in Java to tokenize a string by spaces, but treat any words between apostrophes as "one word"...
so for example, if I have:
This "is a great" day
the tokenizer should produce:
"This"
"is a great"
"day"
Thanks!
Solution
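Using a regex with Pattern and Matcher, not a StringTokenizer (and not String.split(), which treats the regex as the delimiter and so returns the text between the tokens rather than the tokens themselves), how about:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
String input = "this \"is a great\" day";
// Match either the text between double quotes or a single bare word.
Matcher m = Pattern.compile("(?<=\").+(?=\")|\\b\\w+\\b").matcher(input);
while (m.find())
{
    System.out.println("[" + m.group() + "]");
}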
Output:
[this]
[is a great]
[day]
From your example, I assume you mean double-quotes (") not apostrophes (').
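One caveat: the `.+` in that regex is greedy, so an input with two quoted phrases, such as `"a b" c "d e"`, comes back as a single token running from the first quote to the last. Below is a minimal sketch of a variant that stops at the nearest closing quote. The class name `QuoteTokenizer` and method `tokenize` are made up for illustration, and it assumes quotes are balanced and set off from neighbouring words by whitespace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteTokenizer {
    // Either a double-quoted run (captured without its quotes)
    // or a run of non-whitespace characters.
    private static final Pattern TOKEN = Pattern.compile("\"([^\"]*)\"|(\\S+)");

    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            // Group 1 is set for a quoted phrase, group 2 for a bare word.
            tokens.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : tokenize(" yes this \"is a great\" day \"all right\"")) {
            System.out.println("[" + token + "]");
        }
    }
}
```

Running `main` prints `[yes]`, `[this]`, `[is a great]`, `[day]` and `[all right]`, one per line. An unmatched quote is not handled specially here: it simply stays attached to the bare word next to it.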
NB: I initially posted something much simpler, which worked for your example, but not for input like:
" yes this \"is a great\" day all right"
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow