Question

I'm trying to read a text file and split it into individual words using the StringTokenizer utility in Java.

The text file looks like this:

a 2000

4  
b 3000  
c 4000  
d 5000

Now, what I'm trying to do is get each individual word from the text file and store it in an ArrayList. I then try to print every element of the ArrayList at the end.

Here is my code:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.StringTokenizer;

public class getWords {

    public static void main(String[] args) {

        String fileSpecified = args[0];

        fileSpecified = fileSpecified.concat(".txt");
        String line;
        System.out.println("file Specified = " + fileSpecified);

        ArrayList<String> words = new ArrayList<String>();

        try {
            FileReader fr = new FileReader(fileSpecified);
            BufferedReader br = new BufferedReader(fr);
            line = br.readLine();

            StringTokenizer token;
            while ((line = br.readLine()) != null) {
                token = new StringTokenizer(line);
                words.add(token.nextToken());
            }
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }

        for (int i = 0; i < words.size(); i++) {
            System.out.println("words = " + words.get(i));
        }
    }
}

The error message I get is this:

Exception in thread "main" java.util.NoSuchElementException
        at java.util.StringTokenizer.nextToken(Unknown Source)
        at getWords.main(getWords.java:32)

Where 'getWords' is the name of my Java file.

Thank you.


Solution

a) You always have to check StringTokenizer.hasMoreTokens() first. Throwing NoSuchElementException is the documented behaviour if no more tokens are available:

token = new StringTokenizer(line);
while (token.hasMoreTokens())
    words.add(token.nextToken());

b) Don't create a new StringTokenizer for every line unless your file is too large to fit into memory. Read the entire file into a String and let the tokenizer work on that.
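For instance, a sketch of approach (b), reading the whole file into one String before tokenizing (assumes Java 7+ for Files.readAllBytes; the class and method names here are mine, not from the question):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizeWholeFile {

    // Split a string into whitespace-separated tokens with a single tokenizer.
    static List<String> tokenize(String content) {
        List<String> words = new ArrayList<String>();
        StringTokenizer token = new StringTokenizer(content);
        while (token.hasMoreTokens()) {
            words.add(token.nextToken());
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // Read the whole file into one String (file name from the command line,
        // ".txt" appended as in the question).
        String content = new String(Files.readAllBytes(Paths.get(args[0] + ".txt")));
        for (String word : tokenize(content)) {
            System.out.println("words = " + word);
        }
    }
}
```

The default delimiter set of StringTokenizer already covers spaces and newlines, so blank lines simply contribute no tokens.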

OTHER TIPS

Your general approach seems sound, but you have a basic problem in your code.

Your parser is most likely failing on the second line of your input file. That line is blank, so the call words.add(token.nextToken()); throws, because there are no tokens on it. A separate problem: since you call nextToken() only once per line, you only ever get the first token of each line.

You should iterate over the tokens like this:

while (token.hasMoreTokens())
{
    words.add(token.nextToken());
}

You can find a more general example in the javadocs here:

http://download.oracle.com/javase/1.4.2/docs/api/java/util/StringTokenizer.html

This problem is due to the fact that you don't test whether there is a next token before trying to get it. You should always check that hasMoreTokens() returns true before calling nextToken().

But you have other bugs:

  • The first line is read, but never tokenized
  • You only add the first word of each line to your list of words
  • Bad practice: the token variable should be declared inside the loop, not outside
  • You don't close your reader in a finally block

You need to use the hasMoreTokens() method. The code below also addresses the various coding-standard issues in your code pointed out by JB Nizet:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.StringTokenizer;

public class TestStringTokenizer {

    /**
     * @param args
     * @throws IOException 
     */
    public static void main(String[] args) throws IOException {
        String fileSpecified = args[0];

        fileSpecified = fileSpecified.concat(".txt");
        String line;
        System.out.println("file Specified = " + fileSpecified);

        ArrayList<String> words = new ArrayList<String>();

        BufferedReader br = new BufferedReader(new FileReader(fileSpecified));
        try {
            while ((line = br.readLine()) != null) {
                StringTokenizer token = new StringTokenizer(line);
                while (token.hasMoreTokens())
                    words.add(token.nextToken());
            }
        } catch (IOException e) {
            System.out.println(e.getMessage());
            e.printStackTrace();
        } finally {
            br.close();
        }

        for (int i = 0; i < words.size(); i++) {
            System.out.println("words = " + words.get(i));
        }
    }
}
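On Java 7 or later, a try-with-resources statement closes the reader automatically, so the explicit finally block is no longer needed. A sketch of that variant (not from the original answers; the class name and the readWords helper are mine, factored out so the tokenizing logic can work on any Reader):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizerWithResources {

    // Collect all whitespace-separated tokens from the given reader.
    static List<String> readWords(Reader reader) throws IOException {
        List<String> words = new ArrayList<String>();
        // try-with-resources closes the reader even if readLine() throws
        try (BufferedReader br = new BufferedReader(reader)) {
            String line;
            while ((line = br.readLine()) != null) {
                StringTokenizer token = new StringTokenizer(line);
                while (token.hasMoreTokens()) {
                    words.add(token.nextToken());
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        for (String word : readWords(new FileReader(args[0] + ".txt"))) {
            System.out.println("words = " + word);
        }
    }
}
```

Because readWords takes any Reader, it can be exercised with a StringReader in tests without touching the file system.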
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow