It's not always a good idea (or perhaps even a feasible one) to use readlines() without an argument, because it reads in the entire file and can consume a lot of memory. That may be unnecessary if you don't need all of the lines at once, depending on exactly what you're doing.
So, one way to do what you want is to use a Python generator function to extract just the lines or values you need from a file. They're very easy to create: essentially you just use yield statements to return values instead of return. From a programming point of view, the main difference between them is that execution continues with the line following the yield statement the next time a value is requested from the generator, rather than from the function's first line as would normally be the case. This means their internal state is automatically saved between subsequent calls, which makes doing complicated processing inside them easier.
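To make the "state is saved between calls" point concrete, here is a small sketch that has nothing to do with files. The countdown function and its variable names are made up purely for illustration:

```python
def countdown(start):
    """Yield start, start - 1, ..., 1, pausing after each value."""
    current = start
    while current > 0:
        yield current   # execution pauses here until the next value is requested
        current -= 1    # ...and resumes here, with `current` still intact

gen = countdown(3)    # nothing in the function body has run yet
first = next(gen)     # runs up to the first yield
second = next(gen)    # resumes after the yield, decrements, loops, yields again
rest = list(gen)      # drains whatever values are left
```

Each next() call picks up where the previous one stopped, which is why no explicit bookkeeping is needed to remember the current position.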
Here's a fairly minimal example of using one to get just the data you want out of the file, incrementally, one line at a time, so it doesn't require enough memory to hold the whole file:
def read_data(filename):
    with open(filename, 'rt') as file:
        next(file, None); next(file, None)  # ignore first two lines
        for line in file:  # read one line at a time
            value = line.rstrip('\n')
            if value == '#extra':  # end-of-numbers marker
                break
            yield value

for number in read_data('mydatafile'):
    print(number)  # process each number string produced
Of course you can still gather them all together into a list, if you wish, like this:
numbers = list(read_data('mydatafile'))
As you can see, it's possible to do other useful things in the function, such as validating the format of the file data or preprocessing it in other ways. In the example above I've done a little of that by removing the newline characters that readlines() leaves on each line of the list it returns. It would be trivial to also convert each string value into an integer by using yield int(value) instead of just yield value.
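As a rough sketch of that idea, the variant below converts each value to an integer and reports the offending line if the conversion fails. The name read_numbers, the error message, and the sample file contents are my own assumptions for illustration; only the two header lines and the #extra marker come from the example above.

```python
import os
import tempfile

def read_numbers(filename):
    """Yield each number in the file as an int, skipping the two header lines."""
    with open(filename, 'rt') as file:
        next(file, None)  # ignore first header line
        next(file, None)  # ignore second header line
        for line_number, line in enumerate(file, start=3):
            value = line.strip()
            if value == '#extra':  # end-of-numbers marker
                break
            try:
                number = int(value)
            except ValueError:
                raise ValueError(
                    f'line {line_number}: expected an integer, got {value!r}'
                ) from None
            yield number

# Build a small sample file; its contents are invented for this example.
sample = 'header 1\nheader 2\n10\n20\n30\n#extra\nignored\n'
with tempfile.NamedTemporaryFile('wt', suffix='.txt', delete=False) as tmp:
    tmp.write(sample)

try:
    numbers = list(read_numbers(tmp.name))  # gather everything into a list
    total = sum(read_numbers(tmp.name))     # or consume the values lazily
finally:
    os.remove(tmp.name)
```

Passing the generator straight to sum() never builds a list at all, so only one line of the file is held in memory at a time.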
Hopefully this will give you enough of an idea of what's possible and the trade-offs involved when deciding on what approach to use to perform the task at hand.