Question

I'm trying to search the contents of a fairly large file (5 GB) using RandomAccessFile. Using binary search, I divide the file into two parts, but this strategy works only if I can read the line that the RandomAccessFile pointer lands in from its start. Since there is no guarantee that the pointer will land on the start of a line, I need a way to move it back to the start of the line it currently points into. I checked the Java docs but could not find a method that seeks the pointer to the start of the current line. There is a readLine() method, but that reads forward from the current position rather than from the start of the line.


Solution

RandomAccessFile has no method for moving to a line boundary, because it works on bytes rather than characters. With RandomAccessFile, your only option is to scan backward from the current position until you find a newline byte; the current line starts just after it.
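A minimal sketch of that backward scan, in case it helps. The class and method names (LineStart, seekToLineStart) are made up for illustration, and it assumes lines end in '\n' in a single-byte-compatible encoding such as ASCII or UTF-8:

```java
import java.io.IOException;
import java.io.RandomAccessFile;

final class LineStart {
    /**
     * Moves the file pointer to the first byte of the line containing pos
     * and returns that offset. Assumes '\n' line endings (hypothetical helper).
     */
    static long seekToLineStart(RandomAccessFile raf, long pos) throws IOException {
        long p = Math.min(pos, raf.length());
        // Walk backward until the byte just before p is a newline, or offset 0 is reached.
        while (p > 0) {
            raf.seek(p - 1);
            if (raf.read() == '\n') {
                break;
            }
            p--;
        }
        raf.seek(p);
        return p;
    }
}
```

After the call, raf.readLine() returns the whole line. Each step here is a separate seek and read, so for long lines it would probably be faster to read a block of bytes into a buffer and scan that instead.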

OTHER TIPS

Mon Dec 4 03:46:35 2017 200 459 /challenges/leaderboard/view

Mon Dec 4 03:46:35 2017 200 460 /challenges/leaderboard/view

There are many lines in the file, and each has the same format:

<time> <responsetime> <response code> <URL>

Your format is ambiguous: 200 looks like HTTP 200 OK to me, hence the <response code>, but there is something else right before <URL>...

Anyway, one approach would be to index your file so that you know the offset where each line starts. For that you'll have to read the complete file once at startup to build the index. After that you can use RandomAccessFile and its seek(long) method to navigate through the lines easily.

As the index you could simply use a List<Long> index. The offset of line 1 would be index.get(0), which is always 0. The offset of line 2 would be index.get(1), the offset of line 10 would be index.get(9), and so on.
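Building such an index could look roughly like this. It is only a sketch: the LineIndex class and its build method are names I made up, and it assumes '\n' line endings.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

final class LineIndex {
    /** Returns the byte offset at which each line starts (hypothetical helper). */
    static List<Long> build(String path) throws IOException {
        List<Long> index = new ArrayList<>();
        try (InputStream in = new BufferedInputStream(new FileInputStream(path), 1 << 16)) {
            long pos = 0;               // offset of the byte about to be read
            boolean atLineStart = true; // true right after a '\n' and at offset 0
            int b;
            while ((b = in.read()) != -1) {
                if (atLineStart) {
                    index.add(pos);
                    atLineStart = false;
                }
                if (b == '\n') {
                    atLineStart = true;
                }
                pos++;
            }
        }
        return index;
    }
}
```

For a 5 GB file with short lines this list would hold tens of millions of boxed Long values, so a long[] or an index of only every Nth line may be a better fit for memory.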

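Assuming the lines in your file are sorted, you could use binary search like this:

int low = 0;
int high = index.size() - 1;
while (low <= high) {
    int mid = (low + high) >>> 1;   // midpoint without int overflow
    raf.seek(index.get(mid));       // jump to the start of line mid + 1
    String line = raf.readLine();
    int cmp = compare(line, key);   // compare() and key are placeholders for your ordering and search value
    if (cmp == 0) break;            // found
    if (cmp < 0) low = mid + 1;     // line sorts before key: search the upper half
    else high = mid - 1;            // line sorts after key: search the lower half
}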
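The compare step depends on what the file is sorted by. As a guess, if the lines are ordered by the leading timestamp in the format of the sample lines ("Mon Dec 4 03:46:35 2017"), it could be sketched like this. The LogLineKey class, its methods, and the date pattern are my assumptions, not something the question confirms:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class LogLineKey {
    // Assumed pattern for the leading timestamp, e.g. "Mon Dec 4 03:46:35 2017".
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("EEE MMM d HH:mm:ss yyyy", Locale.ENGLISH);

    /** Parses the first five whitespace-separated fields of a line as its timestamp. */
    static LocalDateTime timestampOf(String line) {
        String[] f = line.trim().split("\\s+", 6);
        if (f.length < 5) {
            throw new IllegalArgumentException("not a log line: " + line);
        }
        String stamp = String.join(" ", f[0], f[1], f[2], f[3], f[4]);
        return LocalDateTime.parse(stamp, FORMAT);
    }

    /** Negative if the line is older than key, zero if equal, positive if newer. */
    static int compare(String line, LocalDateTime key) {
        return timestampOf(line).compareTo(key);
    }
}
```

Several lines can share the same timestamp (as the two sample lines do), so a match found this way is not necessarily the first matching line; you may need to step backward through the index to reach it.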

I wrote a program with a function that moves the file pointer to the start of the current line. You can find the code here: https://github.com/VihaanVerma89/RandomSolutions/blob/master/interviewStreet/Search/src/logFind.java

Licensed under: CC-BY-SA with attribution