python 3.0 readline() random jump

https://stackoverflow.com/questions/16592185

29-05-2022

Question

I am reading a txt file line by line using Python 3's built-in readline() method. The file contains employee information in blocks and looks like this:

First name Jack \n
Last name Garcia \n
Manager name Smith \n
Description this is the description of the employee \n
bla bla bla bla \n
bla bla bla bla \n
bla bla bla bla. \n
Salary 25000\n

My code looks like this:

with open(os.path.join(INPUT_FOLDER, filename)) as input_file:
    for line in input_file:
        if line.upper().startswith('DESCRIPTION'):
            description = line.split('DESCRIPTION')[1].strip()
            line = input_file.readline()
            while not line.upper().startswith('SALARY'):
                ...

I get the expected value in the description variable, but when the input_file.readline() statement is executed, it jumps 5 lines further ahead, so I can't read the rest of the description properly. What puzzles me is that I have already read other employee information blocks earlier in the same file and everything worked correctly.

I am executing the script under Eclipse using PyDev 2.7.1.

Has anyone dealt with a similar problem? Is it related to the IDE, the Python version, ...?

Thank you in advance.

Solution

You can't mix file iteration and readline() on the same file object. The Python 2 Built-In Types documentation for file.next() says:

"In order to make a for loop the most efficient way of looping over the lines of a file (a very common operation), the next() method uses a hidden read-ahead buffer. As a consequence of using a read-ahead buffer, combining next() with other file methods (like readline()) does not work right. However, using seek() to reposition the file to an absolute position will flush the read-ahead buffer."

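import os

with open(os.path.join(INPUT_FOLDER, filename)) as input_file:
    while True:
        line = input_file.readline()
        if not line:  # readline() returns '' at end of file
            break
        if line.upper().startswith('DESCRIPTION'):
            description = line.split('DESCRIPTION')[1].strip()
            line = input_file.readline()
            # read the continuation lines up to the SALARY line;
            # the `line and` check stops the loop at end of file
            while line and not line.upper().startswith('SALARY'):
                description += line
                line = input_file.readline()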
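
Another way to avoid the problem is to not call readline() at all and let the for loop be the only thing that reads from the file. The following is a minimal sketch, assuming the block layout shown in the question. The function name read_descriptions and the in-memory io.StringIO sample are hypothetical and only there to make the example self-contained; joining the lines with a single space is also just one possible choice.

```python
import io

def read_descriptions(lines):
    """Collect the description of every employee block in `lines`.

    `lines` is any iterable of text lines, for example an open file.
    """
    descriptions = []
    current = None  # parts of the description being collected, or None
    for line in lines:
        upper = line.upper()
        if upper.startswith('DESCRIPTION'):
            # Drop the keyword by length, so its case does not matter.
            current = [line[len('DESCRIPTION'):].strip()]
        elif upper.startswith('SALARY') and current is not None:
            # The SALARY line closes the description.
            descriptions.append(' '.join(current))
            current = None
        elif current is not None:
            # Continuation line of a multi-line description.
            current.append(line.strip())
    return descriptions

# Hypothetical sample data mirroring the block from the question.
sample = io.StringIO(
    "First name Jack\n"
    "Last name Garcia\n"
    "Manager name Smith\n"
    "Description this is the description of the employee\n"
    "bla bla bla bla\n"
    "bla bla bla bla\n"
    "bla bla bla bla.\n"
    "Salary 25000\n"
)
result = read_descriptions(sample)
print(result)
```

Because only the for loop consumes the file, there is no second reader that could get out of step with it. Note that in this sketch a description that is not followed by a SALARY line is silently dropped.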

OTHER TIPS

You are comparing against the uppercased line but splitting the original line, whose case is unchanged:

line.split('DESCRIPTION')
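
To illustrate, here is a small sketch using the sample line from the question. It assumes the file really contains "Description" in mixed case as shown there. Slicing off the keyword by its length is one possible fix, not the only one.

```python
line = "Description this is the description of the employee\n"

# str.split() is case-sensitive, so the uppercase keyword is not found
# and the whole line comes back as a single element.
parts = line.split('DESCRIPTION')
print(len(parts))  # 1, so parts[1] would raise IndexError

# One possible fix: the prefix has already been matched case-insensitively,
# so drop it by length instead of splitting on it.
if line.upper().startswith('DESCRIPTION'):
    description = line[len('DESCRIPTION'):].strip()
    print(description)
```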

Also, the lines read by this loop

line = input_file.readline()
while not line.upper().startswith('SALARY'):

do not seem to be appended to your description variable.

You probably need to add

line = input_file.readline()
while not line.upper().startswith('SALARY'):
    description += line
    line = input_file.readline()

Licensed under: CC-BY-SA with attribution
