Question

I have a very large, lexicographically ordered text file in which I need to find entries as fast as possible. To avoid traversing the whole file for a specific entry, I stored certain key positions, such as a-->0, b-->4092 etc., where each character is paired with the position of its first occurrence. The positions are obtained by parsing the whole file once with getline() and adding the length of each string to a counter variable. The goal is to jump through the file with seekg(pos) to narrow down the search. It seems to work most of the time, but sometimes it doesn't, and I came here to ask why. The relevant code looks more or less like this:

long pos1 = 10800;
long pos2 = 99725;
ifstream txtFile("path/data.txt");
char temp[200];

txtFile.seekg(pos1, txtFile.beg);
txtFile.getline(temp, 100);
txtFile.getline(temp, 100);
cout << temp << endl;

txtFile.seekg(pos2, txtFile.beg);
txtFile.getline(temp, 100);
txtFile.getline(temp, 100);
cout << temp << endl;

The second getline is there just in case the stream jumped to the end of a line. In the first case, there is no output. Just an empty string. In the second case, the output is a normal line from the text file. The file itself contains no empty lines.

I am a bit at a loss. At first I thought that the fpos type (which is used inside seekg) might be very small and unable to handle numbers above ~10,000, but then I happened to get a valid lookup from the 99,000 range. Has anyone ever had a similar problem?

edit: I just found a possible reason for the problem. In another thread that dealt with seekg, it was advised to reopen the ifstream in order to clear the fail flags. I did that, and now at least subsequent calls produce something. This tells me that something apparently goes wrong while calling txtFile.seekg(pos1, txtFile.beg);, but it isn't end of file.

edit 2: I just checked: the failbit is set after a getline call that doesn't read anything.


Solution

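My bad, I was looking for the error in the wrong place. The problem wasn't seekg; it was the getline member function of ifstream, which reads into a char[] instead of a string, and that came as a surprise to me. If the array (or the length passed to getline) is too small and the delimiter hasn't been found by the time the limit is reached, getline sets the failbit, and every subsequent read on that stream fails until the flag is cleared.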
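
A minimal sketch of that failure mode, in case it helps to see it in isolation. The helper name readWithLimit and the ReadResult struct are made up for this illustration and are not part of the original code; it reads from any std::istream, so it can be tried on a std::istringstream without a file.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical result type for this illustration.
struct ReadResult {
    std::string text;  // whatever ended up in the buffer
    bool failed;       // true if failbit (or badbit) is set after the call
};

// Hypothetical helper: reads one line with the *member* getline, which stores
// at most `limit - 1` characters, and reports the stream state afterwards.
ReadResult readWithLimit(std::istream& in, std::size_t limit) {
    char buf[256] = {};  // zero-initialised, so it is always a valid C string
    assert(limit > 0 && limit <= sizeof buf);
    in.getline(buf, static_cast<std::streamsize>(limit));
    return {std::string(buf), in.fail()};
}
```

With a stream holding "abcdefghij\nxyz\n" and a limit of 5, the first call stores "abcd" and reports failed, because the newline was not reached within four characters. A second call then reads nothing at all, and only after in.clear() does reading continue with the rest of the line. In the question's code the limit passed was 100, so presumably any line longer than 99 characters would trigger the same behaviour.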

The problem can be avoided by making sure that the array is at least as long as the longest line plus the terminating null character, or by calling the free function std::getline with a std::string, which grows as needed.
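
As a rough sketch of the std::string variant, combined with the offset index from the question: the names OffsetIndex, buildIndex and lineAt are hypothetical, and recording tellg() before each line is a substitution for the length counting described in the question. It is used here because getline does not store the line terminator, so summed string lengths leave it out unless it is added separately.

```cpp
#include <cassert>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical index type: first character of a line -> offset of the first
// line that starts with that character.
using OffsetIndex = std::map<char, std::streampos>;

// Record the stream position *before* each line is read, so every stored
// offset can be passed straight back to seekg().
OffsetIndex buildIndex(std::istream& in) {
    OffsetIndex index;
    std::string line;
    while (true) {
        const std::streampos pos = in.tellg();
        if (!std::getline(in, line)) break;      // end of input
        assert(pos != std::streampos(-1));       // stream must be seekable
        if (!line.empty()) {
            index.emplace(line[0], pos);         // emplace keeps the first occurrence
        }
    }
    return index;
}

// Seek to an offset and read the whole line there, whatever its length.
// Returns false if the seek or the read fails.
bool lineAt(std::istream& in, std::streampos pos, std::string& out) {
    in.clear();  // drop eofbit/failbit left over from earlier reads
    in.seekg(pos);
    return static_cast<bool>(std::getline(in, out));
}
```

Because std::getline resizes the string, there is no length limit to guess, and checking the return value of lineAt makes a failed seek or read visible instead of silently printing an empty string. Both functions take a std::istream&, so they should work the same way on an ifstream as on a std::istringstream.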

Licensed under: CC-BY-SA with attribution