Question

I am reading contents from files of a directory. I have to segregate the files according to their names and then read their contents. When I run the code simply without reading the contents, all the files are being listed of a particular file name, but when I try to read the contents, it reads contents from only few files, in fact just 10 of them. But the directory has about 1000 files of a particular name. I am posting the code here.

for (i = 0; i <= filenames.length; i++) {
    read = new FileReader("trainfiles/"+filenames[i]);          
    br = new BufferedReader(read);

    if (filenames[i].matches(".*ham.*")) {
        System.out.println("ham:" + filenames[i]);
        while ((lines = br.readLine()) != null) {
            st = new StringTokenizer(lines);
            while (st.hasMoreTokens()) {
                System.out.println(st.nextToken());
            }
        }
        br.close();
    }
}

Could anyone tell me where am i doing wrong!?
thanks

EDIT #1 I did some modifications which I have been told here, but the problem still persists, here is the code.

for(i=0;i<=filenames.length;i++){
            read = new FileReader("trainfiles/"+filenames[i]);

            br = new BufferedReader(read);

            if(filenames[i].matches(".*ham.*")){
                System.out.println("ham:"+filenames[i]);

                        while((lines = br.readLine())!= null){
                            st = new StringTokenizer(lines);
                            while(st.hasMoreTokens()){
                                System.out.println(st.nextToken());
                            }

                        }

            }
            br.close();
            read.close();




                        }

EDIT #2 Now the code looks like this, but again...its not giving me the result I want.

for (i = 0; i < filenames.length; i++) {
               try {


                if (filenames[i].matches(".*ham.*")) {
                     read = new FileReader("trainfiles/"+filenames[i]);          
                        br = new BufferedReader(read);
                    System.out.println("ham:" + filenames[i]);
                    while ((lines = br.readLine()) != null) {
                        st = new StringTokenizer(lines);
                        while (st.hasMoreTokens()) {
                            System.out.println(st.nextToken());
                        }
                    }
                }
               } finally {

                read.close();
                br.close();
               }
            }
Was it helpful?

Solution

I would re-write your code like this, and see what output you get:

for (filename : filenames) {
   if (filename.matches(".*ham.*")) {
      System.out.println("ham:" + filename);

      // reset these to null (where are they declared?)
      read = null;   
      br = null;   
      try {
         read = new FileReader("trainfiles/"+filename);          
         br = new BufferedReader(read);

         while ((lines = br.readLine()) != null) {
            System.out.println(lines);
            // st = new StringTokenizer(lines);
            // while (st.hasMoreTokens()) {
            //    System.out.println(st.nextToken());
            // }
         }
      } catch (Exception e) {
         e.printStackTrace();
      } finally {
         if (br != null) br.close();
         if (read != null) read.close();
      }
   } 
}

Some general comments on your original code:

  1. Only use a for loop if you actually need the array index. Prefer a for-each loop (i.e. for (filename : filenames) ...).

  2. Declare variables in the narrowest scope possible. In this case, you should declare your read and br variables where I initialize them to null.

  3. Never open a file unless you're going to use it. Here, that means opening it inside the conditional block.

  4. Since opening a file can throw an exception, br may not get initialized, in which case you can't close it. You need to check for null first.

OTHER TIPS

First of all you should use i<filenames.length. Second, matches expects a regular expression, not *-globs. The expression you used is a valid regular expression for [something]ham[something] - is that what you meant?

I don't think you need to close the Filereader - I think BR's close propagates up. But that's worth checking. EDIT as was mentioned, you need to always close the file, outside the if.

You should close your FileReader object read as well.

Unless this is homework, I would also suggest you take a look at commons-io.

EDIT #1: I would suggest doing both close operations in a finally block.

EDIT #2: Did you try this?

for (i = 0; i <= filenames.length; i++) {
   try {
    read = new FileReader("trainfiles/"+filenames[i]);          
    br = new BufferedReader(read);

    if (filenames[i].matches(".*ham.*")) {
        System.out.println("ham:" + filenames[i]);
        while ((lines = br.readLine()) != null) {
            st = new StringTokenizer(lines);
            while (st.hasMoreTokens()) {
                System.out.println(st.nextToken());
            }
        }
    }
   } finally {
    br.close();
    read.close();
   }
}

1000+ files are a lot of files to read. If it can't read a file it should throw an exception (IOException to be specific). Maybe print the exception message in the catch block and paste it here.

I don't know the StringTokenizer class but does the code give errors when you just print the line without the StringTokenizer?

An other option is to use threads. You have the array of files and then you start some threads who reads a file (producer/consumer problem).

By the way, you can filter files with the class FileFilter.

http://download.oracle.com/javase/1.4.2/docs/api/java/io/File.html#listFiles%28java.io.FileFilter%29

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top