Using the file as an iterator (such as calling next() on it or using it in a for loop) reads ahead into an internal buffer; the actual file read position is further along the file, and .tell() will not give you the position of the next line to be yielded. (In Python 3, calling .tell() on a text-mode file after next() raises an OSError instead.)
If you need to seek back and forth, the solution is to not call next() directly on the file object and to use file.readline() only. You can still use an iterator for that; use the two-argument version of iter():
fileobj = open(filename)
fh = iter(fileobj.readline, '')
Calling next() on fh will invoke fileobj.readline() each time, until that function returns an empty string. In effect, this creates a file iterator that doesn't use the internal buffer.
Demo:
>>> fh = open('example.txt')
>>> fhiter = iter(fh.readline, '')
>>> next(fhiter)
'foo spam eggs\n'
>>> fh.tell()
14
>>> fh.seek(0)
0
>>> next(fhiter)
'foo spam eggs\n'
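The same behaviour can be checked in a small script. This is only a sketch: the temporary file and its contents are made up for the illustration, and newline='' is passed so the offsets are the same on every platform.

```python
import os
import tempfile

# Hypothetical sample data, written to a temporary file for the demonstration.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False, newline='') as tmp:
    tmp.write('foo spam eggs\nbar\nbaz\n')
    path = tmp.name

with open(path, newline='') as fileobj:
    lines = iter(fileobj.readline, '')  # calls fileobj.readline() until it returns ''
    first = next(lines)
    position = fileobj.tell()           # offset just past the first line
    second = next(lines)
    fileobj.seek(position)              # go back to the start of the second line
    second_again = next(lines)          # the iterator continues from the new position

os.remove(path)
print(repr(first), position, repr(second), repr(second_again))
```

Because every line comes from readline(), tell() and seek() stay in step with what the iterator has handed out.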
Note that your enumerate chain can be simplified to:
items = itertools.chain(enumerate(fh, start=1), (None,))
although I don't see why you think a (None,) sentinel is needed here; StopIteration will still be raised, just one next() call later.
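To illustrate what the sentinel does, here is a minimal sketch that uses a plain list as a stand-in for the file object (the list contents are invented for the example):

```python
import itertools

lines = ['a\n', 'b\n']  # stand-in for an open file
items = itertools.chain(enumerate(lines, start=1), (None,))

# Two real (line number, line) pairs, then the None sentinel.
results = [next(items), next(items), next(items)]

try:
    next(items)          # one call after the sentinel
    exhausted = False
except StopIteration:
    exhausted = True     # the chain still ends with StopIteration

print(results, exhausted)
```

The sentinel only delays StopIteration by one call; it does not prevent it.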
To read specialLines lines, use itertools.islice():
for lino, eline in islice(items, specialLines):
    # etc. get the special data I need here
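A short sketch of how islice() takes a fixed number of items from a shared iterator and leaves the rest for later. The list of lines is made up, and specialLines is set to 2 purely for the example:

```python
from itertools import islice

# Stand-in for the enumerated file; the contents are hypothetical.
items = enumerate(['SPECIAL\n', 'x\n', 'y\n', 'z\n'], start=1)
specialLines = 2

header = next(items)                         # the line that starts the section
special = list(islice(items, specialLines))  # consumes exactly specialLines items
rest = list(items)                           # whatever islice() did not consume

print(header, special, rest)
```

islice() does not copy or rewind anything; it advances the underlying iterator, so the outer loop continues after the special section.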
You can just loop directly over fh instead of using an infinite loop and next() calls here too:
from itertools import islice

with open(filename) as fh:
    enumerated = enumerate(iter(fh.readline, ''), start=1)
    for lino, line in enumerated:
        # handle special section
        if line.startswith('SPECIAL'):
            start = fh.tell()
            for lino, eline in islice(enumerated, specialLines):
                pass  # etc. get the special data I need here
            fh.seek(start)
but do note that your line numbers will still increment even when you seek back!
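Here is a small sketch of that effect. The temporary file, its contents, and the look-ahead count of 2 are all assumptions made for the demonstration:

```python
import os
import tempfile
from itertools import islice

# Hypothetical input: one SPECIAL marker followed by three ordinary lines.
with tempfile.NamedTemporaryFile('w', delete=False, newline='') as tmp:
    tmp.write('SPECIAL\none\ntwo\nthree\n')
    path = tmp.name

seen = []  # (line number, line) pairs as the outer loop sees them
with open(path, newline='') as fh:
    enumerated = enumerate(iter(fh.readline, ''), start=1)
    for lino, line in enumerated:
        seen.append((lino, line))
        if line.startswith('SPECIAL'):
            start = fh.tell()
            for lino, eline in islice(enumerated, 2):
                pass           # look ahead at two lines
            fh.seek(start)     # rewind the file, but not the counter

os.remove(path)
print(seen)
```

The lines 'one' and 'two' are read twice, so the outer loop reports them as lines 4 and 5 rather than 2 and 3.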
You probably want to refactor your code to not need to re-read sections of your file, however.
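One possible shape for such a refactoring, as a rough sketch only: buffer the look-ahead lines in a list and hand them to both parts of your processing, so nothing is read twice. The scan() function and its two result lists are hypothetical names, not something from your code:

```python
from itertools import islice

def scan(lines, special_count):
    """Return (ordinary, special) lists of (line number, line) pairs."""
    numbered = enumerate(lines, start=1)
    ordinary, special = [], []
    for lino, line in numbered:
        ordinary.append((lino, line))
        if line.startswith('SPECIAL'):
            # Read the special block once and keep it in memory.
            block = list(islice(numbered, special_count))
            special.extend(block)   # use it for the special data ...
            ordinary.extend(block)  # ... and for the normal processing
    return ordinary, special

ordinary, special = scan(['SPECIAL\n', 'one\n', 'two\n', 'three\n'], 2)
print(ordinary, special)
```

Since there is no seeking, this works with any iterable of lines, including a file object iterated directly, and the line numbers stay correct. A SPECIAL marker inside a buffered block would not be noticed by this sketch.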