Question

How can I divide a sentence like "He and his brother playing football." into parts such as "He and", "and his", "his brother", "brother playing", and "playing football"? Is it possible to do this in Java?

Solution

Assuming the "words" are always separated by whitespace, use String.split(). The regex "\\s+" matches one or more whitespace characters, so repeated spaces are handled too:

String[] words = "He and his brother playing football.".split("\\s+");
for (int i = 0, l = words.length; i + 1 < l; i++)
        System.out.println(words[i] + " " + words[i + 1]);
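
Note that the loop above prints "playing football." with the trailing period, while the question asks for "playing football". Here is a minimal sketch of the same idea wrapped in a method that strips trailing punctuation first. The class name WordPairs, the method name pairs, and the choice of which punctuation to strip are my own assumptions, not part of the original answer:

```java
import java.util.ArrayList;
import java.util.List;

class WordPairs {

    // Returns every pair of adjacent words in the sentence, in order.
    static List<String> pairs(String sentence) {
        // Assumption: only trailing '.', '!' and '?' need to be removed.
        String cleaned = sentence.trim().replaceAll("[.!?]+$", "");
        List<String> result = new ArrayList<>();
        if (cleaned.isEmpty()) {
            return result;
        }
        // Split on runs of whitespace, as in the answer above.
        String[] words = cleaned.split("\\s+");
        for (int i = 0; i + 1 < words.length; i++) {
            result.add(words[i] + " " + words[i + 1]);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String pair : pairs("He and his brother playing football.")) {
            System.out.println(pair);
        }
    }
}
```

Returning a List instead of printing directly makes the pairs easier to reuse; a sentence with fewer than two words simply gives an empty list.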

OTHER TIPS

You can split text with the BreakIterator class. Its static method getSentenceInstance() returns a new BreakIterator instance for sentence breaks in the default locale.

You can also use getWordInstance(), getLineInstance(), and so on to break text into words, lines, etc.

For example:

BreakIterator boundary = BreakIterator.getSentenceInstance();
boundary.setText("Your_Sentence");
int start = boundary.first();
int end = boundary.next();

Keep calling next() until it returns BreakIterator.DONE; each start/end pair marks one sentence.

For more detail, see the API documentation:

http://docs.oracle.com/javase/6/docs/api/java/text/BreakIterator.html
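
Since the question is about word pairs rather than sentences, here is a rough sketch of how getWordInstance() might be used for that. The class and method names are hypothetical, and I'm assuming English text and that a "word" is any token starting with a letter or digit. The word iterator also reports spaces and punctuation as separate tokens, so those get filtered out:

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class BreakIteratorPairs {

    // Collects the words of the text using a word-boundary iterator.
    static List<String> words(String text) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.ENGLISH);
        it.setText(text);
        List<String> words = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String token = text.substring(start, end);
            // Assumption: keep only tokens that begin with a letter or digit,
            // which drops the whitespace and punctuation tokens.
            if (Character.isLetterOrDigit(token.charAt(0))) {
                words.add(token);
            }
        }
        return words;
    }

    // Joins each word with the word that follows it.
    static List<String> pairs(String text) {
        List<String> words = words(text);
        List<String> result = new ArrayList<>();
        for (int i = 0; i + 1 < words.size(); i++) {
            result.add(words.get(i) + " " + words.get(i + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        for (String pair : pairs("He and his brother playing football.")) {
            System.out.println(pair);
        }
    }
}
```

Because the punctuation comes back as its own token, the final pair is "playing football" without the period.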

Edited answer: here is working code that prints each sentence:

String sent = "My name is vivek. I work in TaxSmart";
BreakIterator bi = BreakIterator.getSentenceInstance();
bi.setText(sent);
int index = 0;
while (bi.next() != BreakIterator.DONE) {
    String sentence = sent.substring(index, bi.current());
    System.out.println("Sentence: " + sentence);
    index = bi.current();
}

Another option is to split on spaces and print each pair of adjacent words:

String str = "He and his brother playing football";
String[] strArray = str.split(" ");
for (int i = 0; i < strArray.length - 1; i++) {
    System.out.println(strArray[i] + " " + strArray[i + 1]);
}

Use a StringTokenizer to separate by spaces or other characters.

import java.util.StringTokenizer;

public class Test {

    // Splits the string on whitespace (the default StringTokenizer delimiters).
    private static String[] tokenize(String str) {
        StringTokenizer tokenizer = new StringTokenizer(str);
        String[] arr = new String[tokenizer.countTokens()];
        int i = 0;
        while (tokenizer.hasMoreTokens()) {
            arr[i++] = tokenizer.nextToken();
        }
        return arr;
    }

    public static void main(String[] args) {
        String[] strs = tokenize("Sandy sells seashells by the sea shore.");
        for (String s : strs)
            System.out.println(s);
    }
}

This should print:

Sandy
sells
seashells
by
the
sea
shore.

This may or may not be what you're after.
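
If the adjacent pairs from the question are the goal, the tokenizer approach could be adapted along these lines. This is only a sketch: the class name TokenizerPairs and the delimiter set (whitespace plus a few punctuation marks) are assumptions on my part. It remembers the previous token and joins it with the current one:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerPairs {

    // Returns each token joined with the token that follows it.
    static List<String> pairs(String sentence) {
        // Assumption: whitespace and common punctuation act as delimiters,
        // so "shore." is tokenized as "shore".
        StringTokenizer tokenizer = new StringTokenizer(sentence, " \t\n\r\f.,!?");
        List<String> result = new ArrayList<>();
        if (!tokenizer.hasMoreTokens()) {
            return result;
        }
        String previous = tokenizer.nextToken();
        while (tokenizer.hasMoreTokens()) {
            String current = tokenizer.nextToken();
            result.add(previous + " " + current);
            previous = current;
        }
        return result;
    }

    public static void main(String[] args) {
        for (String pair : pairs("Sandy sells seashells by the sea shore.")) {
            System.out.println(pair);
        }
    }
}
```

Passing the delimiters as the second constructor argument is what lets the tokenizer split on "other characters" besides spaces.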

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow