Question

I have the following piece of code in Java:

HttpURLConnection con = (HttpURLConnection)new URL(url).openConnection();
con.connect();
InputStream stream = con.getInputStream();
BufferedReader file = new BufferedReader(new InputStreamReader(stream));

At this point, I read the file from start to end while searching for something:

while (true)
{
    String line = file.readLine();
    if (line == null)
        break;
    // Search for something...
}

Now I want to search for something else within the file, without opening another URL connection.

For reasons unrelated to this question, I wish to avoid searching for both things "in a single file-sweep".

Questions:

  1. Can I rewind the file with reset?

  2. If yes, should I apply it on the InputStream object, on the BufferedReader object or on both?

  3. If no, then should I simply close the file and reopen it?

  4. If yes, should I apply it on the InputStream object, on the BufferedReader object or on both?

  5. If no, how else can I sweep the file again, without reading through the URL connection again?


Solution

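You can rewind the file with reset(), provided that you first called mark() at the position you want to return to. Call both methods on the outermost decorator, i.e. the BufferedReader. Note that BufferedReader.mark(int readAheadLimit) takes a limit on how many characters may be read before the mark becomes invalid, so rewinding a whole file requires a limit larger than the file's length in characters, which in effect keeps the whole file buffered in memory.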
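A minimal sketch of the mark()/reset() approach on a BufferedReader. It reads from a StringReader so that it runs without a network connection; the class name, the helper methods, and the sample text are made up for illustration. The read-ahead limit is set to one more than the text length, since a reset after reading up to the limit may fail.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

class MarkResetDemo {
    // Reads the reader to the end and counts the lines containing `needle`.
    static int countMatches(BufferedReader reader, String needle) throws IOException {
        int count = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.contains(needle)) {
                count++;
            }
        }
        return count;
    }

    // Sweeps the same reader twice. readAheadLimit must be larger than the
    // number of characters read between mark() and reset().
    static int[] searchTwice(Reader source, int readAheadLimit,
                             String first, String second) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        reader.mark(readAheadLimit);              // remember the start
        int a = countMatches(reader, first);      // first sweep
        reader.reset();                           // rewind to the mark
        int b = countMatches(reader, second);     // second sweep
        return new int[] {a, b};
    }

    public static void main(String[] args) throws IOException {
        String text = "alpha\nbeta\nalpha beta\n";
        int[] counts = searchTwice(new StringReader(text), text.length() + 1, "alpha", "beta");
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```

For the connection in the question you would pass new InputStreamReader(stream) as the source, but you would also have to know an upper bound on the response size in advance to choose the limit.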

However, you may want to reconsider your design: you can easily read the whole file into a data structure (a list of strings, or a stream backed by a string) and then search that data as many times as you like.
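As a rough illustration of the read-once idea, the sketch below loads all lines into a List<String> and then searches the list repeatedly. The class and method names are hypothetical, and a StringReader stands in for the URL stream.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class ReadOnceDemo {
    // Reads every line from the source exactly once and closes it.
    static List<String> readAllLines(Reader source) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    // Returns the index of the first line containing `needle`, or -1 if none does.
    static int firstIndexContaining(List<String> lines, String needle) {
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).contains(needle)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = readAllLines(new StringReader("one\ntwo\nthree\n"));
        // Each search is a separate sweep over the in-memory copy.
        System.out.println(firstIndexContaining(lines, "two"));
        System.out.println(firstIndexContaining(lines, "three"));
    }
}
```

With the code from the question, the source would be new InputStreamReader(con.getInputStream()), and the connection is only read once no matter how many searches follow.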

OTHER TIPS

Use the following methods:

  • mark
  • skip
  • reset

You can do this only if markSupported() returns true. Note that not every reader or stream supports it: buffering decorators such as BufferedReader and BufferedInputStream implement mark() and reset() themselves on top of their internal buffer, whereas InputStreamReader and many raw input streams do not. So always call markSupported() first, and keep in mind that it can return false.

For example, this can easily happen with URL-based streams: consider how you could reset a stream that originates from a remote server. Doing so requires the client side to cache all the content it has already downloaded.
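A small sketch of checking markSupported() before relying on mark/reset. The unmarkable() helper is a made-up stand-in for a network stream, and ensureMarkable() is a hypothetical helper that falls back to a BufferedInputStream, which caches what has been read on the client side.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class MarkSupportDemo {
    // Returns the stream itself if it supports mark/reset,
    // otherwise a buffered wrapper that does.
    static InputStream ensureMarkable(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    // Hypothetical stand-in for a remote stream: it serves bytes but
    // inherits InputStream's default markSupported(), which returns false.
    static InputStream unmarkable(byte[] data) {
        ByteArrayInputStream delegate = new ByteArrayInputStream(data);
        return new InputStream() {
            @Override
            public int read() {
                return delegate.read();
            }
        };
    }

    public static void main(String[] args) throws IOException {
        InputStream raw = unmarkable("hello".getBytes());
        System.out.println(raw.markSupported());   // false

        InputStream in = ensureMarkable(raw);
        in.mark(16);                               // allow 16 bytes of read-ahead
        int first = in.read();
        in.reset();                                // back to the mark
        System.out.println(first == in.read());    // true
    }
}
```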

I usually end up using something like InputStreamSource to make re-reading convenient. When I'm dealing with connections, I find it useful to use an in-memory or on-disk spooling strategy for re-reading. Use a threshold for choosing storage location, "tee" into the spool on first read, and re-read from the spool on subsequent reads.
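The spooling idea could be sketched roughly as follows. This is a simplified variant, not the InputStreamSource interface mentioned above: it copies the stream eagerly instead of tee-ing during the first read, and the SpooledSource class, its methods, and the threshold parameter are all hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class SpooledSource {
    private final byte[] memory; // non-null when the content stayed within the threshold
    private final Path file;     // non-null when the content was spilled to disk

    private SpooledSource(byte[] memory, Path file) {
        this.memory = memory;
        this.file = file;
    }

    // Consumes `in` once. Content up to `threshold` bytes stays in memory;
    // anything larger is written to a temporary file.
    static SpooledSource spool(InputStream in, int threshold) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
            if (buffer.size() > threshold) {
                Path tmp = Files.createTempFile("spool", ".bin");
                tmp.toFile().deleteOnExit();
                try (OutputStream out = Files.newOutputStream(tmp)) {
                    buffer.writeTo(out);                 // flush what was buffered so far
                    while ((n = in.read(chunk)) != -1) { // then copy the rest directly
                        out.write(chunk, 0, n);
                    }
                }
                return new SpooledSource(null, tmp);
            }
        }
        return new SpooledSource(buffer.toByteArray(), null);
    }

    // Every call returns a fresh stream positioned at the start of the content.
    InputStream openStream() throws IOException {
        return memory != null ? new ByteArrayInputStream(memory) : Files.newInputStream(file);
    }

    boolean isOnDisk() {
        return file != null;
    }
}
```

You would call SpooledSource.spool(con.getInputStream(), someThreshold) once and then wrap each openStream() result in a new BufferedReader for every sweep.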

Edit: I also found Guava's ByteSource and CharSource, which serve the same purpose.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow