Question

I split this String on spaces: String input = ":-) :) :o) :] :3 :c) :> =] 8) =) :} :^)"; (the emoticons are separated by spaces)

The result is:

:-)?:)?:o)?:]?:3?:c)?:>
=]

8)

=)?:}?:^)

There are some strange characters in the results. I don't know why. Please help me.

Here is the code:

fileReader = new BufferedReader(new FileReader("emoticon.txt"));
String line = "";
while ((line = fileReader.readLine()) != null){
    String[] icons = parts[0].split("\\s+");
    ....
}

Thanks for any advice. Here is the emoticon file:
https://www.dropbox.com/s/6ovz0aupqo1utrx/emoticon.txt


Solution

String input = ":-) :) :o) :] :3 :c) :> =] 8) =) :} :^)";
String[] similies = input.split(" ");
for (String simili : similies) {
    System.out.println(simili);
}

This works fine. Output:

:-)
:)
:o)
:]
:3
:c)
:>
=]
8)
=)
:}
:^)

If the input may also contain tabs, newlines, or runs of several spaces, you can split on any whitespace instead:

input.split("\\s+"); 

In your file there are a few more characters, such as Â and non-breaking spaces, so you have to handle these characters explicitly. Here is the code:

public static void main(final String[] args) throws Exception {
    BufferedReader fileReader = new BufferedReader(new FileReader("emoticon.txt"));
    String line = "";
    while ((line = fileReader.readLine()) != null) {
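        // Drop the stray "Â" that appears in front of each non-breaking space
        line = line.replace("Â", "");
        // Turn each non-breaking space (char 160, U+00A0) into a regular space
        line = line.replace((char) 160, ' ');
        System.out.println("line: " + line);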
        String[] icons = line.split("\\s+");
        for (String icon : icons) {
            System.out.println(icon);
        }
        System.out.println("=======================");
    }
}
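
A note on where the "Â" may come from: the pair "Â" followed by char 160 is what a non-breaking space looks like when its UTF-8 bytes (0xC2 0xA0) are decoded with a single-byte charset such as ISO-8859-1. The sketch below assumes the file is UTF-8 encoded, which has not been verified here; it uses in-memory bytes instead of emoticon.txt, and the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CharsetDemo {
    // Decode the bytes with the given charset, then split on whitespace or U+00A0.
    static String[] decodeAndSplit(byte[] bytes, Charset charset) {
        String line = new String(bytes, charset);
        return line.split("[\\s\\u00A0]+");
    }

    public static void main(String[] args) {
        // Two emoticons separated by a non-breaking space, encoded as UTF-8.
        // The non-breaking space becomes the two bytes 0xC2 0xA0.
        byte[] utf8 = ":)\u00A0:]".getBytes(StandardCharsets.UTF_8);

        // Decoded with the wrong charset, 0xC2 turns into a stray "Â".
        System.out.println(Arrays.toString(
                decodeAndSplit(utf8, StandardCharsets.ISO_8859_1)));

        // Decoded as UTF-8, no stray character is left to clean up.
        System.out.println(Arrays.toString(
                decodeAndSplit(utf8, StandardCharsets.UTF_8)));
    }
}
```

If that assumption holds for your file, opening it with an explicit charset, for example Files.newBufferedReader(Paths.get("emoticon.txt"), StandardCharsets.UTF_8), should make the replace("Â", "") step unnecessary. FileReader without a charset uses the platform default, which is why the result can differ between machines.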

OTHER TIPS

They may not be just space characters; they could be tabs etc.

Instead, try splitting on whitespace characters (regex \s), rather than just specifically space characters:

String[] emoticons = input.split("\\s+");

I analysed the file referred to in the comment and found that some of the "spaces" were actually characters with decimal value 160 (hex A0). By changing the split regex to include this character, I was able to split every emoticon:

String[] emoticons = input.split("[\\s\u00A0]+");
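
To see why the plain \s pattern is not enough, here is a small comparison. By default Java's \s does not match the non-breaking space (U+00A0), so it has to be added to the character class, or the pattern has to be switched to Unicode character classes with the (?U) flag. The sample string is invented for the demonstration and is not taken from the file; the class and method names are hypothetical.

```java
import java.util.Arrays;

class NbspSplitDemo {
    // Made-up sample: a mix of non-breaking spaces (U+00A0) and regular spaces.
    static final String INPUT = ":-)\u00A0:) :o)\u00A0:]";

    // Default \s: matches space, tab, newline etc., but not U+00A0.
    static String[] splitPlain(String s) {
        return s.split("\\s+");
    }

    // \s extended with an explicit U+00A0, as in the answer above.
    static String[] splitWithNbsp(String s) {
        return s.split("[\\s\\u00A0]+");
    }

    // (?U) enables UNICODE_CHARACTER_CLASS, so \s follows the Unicode
    // White_Space property, which includes U+00A0.
    static String[] splitUnicode(String s) {
        return s.split("(?U)\\s+");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitPlain(INPUT)));
        System.out.println(Arrays.toString(splitWithNbsp(INPUT)));
        System.out.println(Arrays.toString(splitUnicode(INPUT)));
    }
}
```

With this sample, the plain split leaves the emoticons joined wherever a non-breaking space sits between them, while the other two variants separate all four.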

Since you are seeing newlines in your output, the original input string probably contains other kinds of whitespace, such as newlines and tabs, in addition to spaces.

So, you need to split the string on whitespace:

String[] splitted = input.split("\\s+");

You have to pass a regular expression to split.

Try with

String[] array = input.split("\\s+");
Licensed under: CC-BY-SA with attribution