Question

I feel like almost every time I read a file in Python, what I want is:

with open("filename") as file_handle:
    for line in file_handle:
        #do something

Is this truly the preferred idiom? It mildly irritates me that this double indents all file reading logic. Is there a way to collapse this logic into one line or one layer?

Solution

For simple cases, yes, the two-level with and for is idiomatic.

For cases where the indentation becomes a problem, here as anywhere else in Python, the idiomatic solution is to find something to factor out into a function.
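One way to do that factoring, as a minimal sketch: a generator function that owns the with block and yields lines, so the caller needs only one level of indentation. The helper name iter_lines and the temporary sample file are hypothetical, added only so the example runs on its own.

```python
import os
import tempfile


def iter_lines(path):
    """Yield each line of the file at path, without its trailing newline."""
    with open(path) as file_handle:
        for line in file_handle:
            yield line.rstrip("\n")


# Create a small sample file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\nbeta\n")
    sample_path = tmp.name

# The caller's loop body is indented only once.
collected = []
for line in iter_lines(sample_path):
    collected.append(line.upper())

os.remove(sample_path)
```

Note that the file stays open until the generator is exhausted or closed, so this is not an exact replacement for a with block around the whole loop.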


You can write wrappers to help with this. For example, a bare for line in open("filename") leaves the file open after the loop finishes, even in the best case, until the end of the scope (which could be days later, or never, if the scope is a main event loop or a generator or something). Here's a simple wrapper that solves some of the problems that with exists to solve:

def with_iter(iterable):
    with iterable:
        yield from iterable

for line in with_iter(open("filename")):
    # do something

for line in with_iter(open("other_filename")):
    # do something else

Of course it doesn't solve everything. (See this ActiveState recipe for more details.)

If you know that it does what you want, great. If you don't understand the differences… stick to what's idiomatic; it's idiomatic for a reason.
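To make one of those differences concrete, here is a small sketch of when with_iter actually closes the file. It assumes Python 3.3+ (for yield from); the temporary sample file and the variable names are only for illustration.

```python
import os
import tempfile


def with_iter(iterable):
    with iterable:
        yield from iterable


# Create a small sample file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
    tmp.write("one\ntwo\nthree\n")
    sample_path = tmp.name

# Case 1: the loop runs to completion, so the with block exits
# and the file is closed.
handle = open(sample_path)
for line in with_iter(handle):
    pass
closed_after_full_loop = handle.closed

# Case 2: the generator is only partly consumed. It is suspended
# inside the with block, so the file is still open.
handle = open(sample_path)
lines = with_iter(handle)
first = next(lines)
open_while_suspended = not handle.closed

# Closing the generator explicitly unwinds the with block.
lines.close()
closed_after_close = handle.closed

os.remove(sample_path)
```

In other words, the file is closed only once the generator finishes or is closed. If you stop iterating early and keep a reference to the generator, the file stays open.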


So, how do you refactor the code? The simplest way is often to turn the loop body into a function, so you can just use map or a comprehension:

def do_with_line(line):
    return line

with open("filename") as f:
    process = [do_with_line(line) for line in f]
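The map version looks much the same, with one catch worth sketching: in Python 3, map is lazy, so it has to be consumed while the file is still open. The body of do_with_line below and the temporary sample file are placeholders for illustration.

```python
import os
import tempfile


def do_with_line(line):
    # Placeholder transformation; stands in for the real loop body.
    return line.strip()


# Create a small sample file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
    tmp.write("  a \n b\n")
    sample_path = tmp.name

with open(sample_path) as f:
    # list() forces map to read every line while the file is open.
    processed = list(map(do_with_line, f))

with open(sample_path) as f:
    lazy = map(do_with_line, f)  # nothing has been read yet

try:
    next(lazy)
    lazy_error = None
except ValueError as exc:
    # Reading from a closed file raises ValueError.
    lazy_error = exc

os.remove(sample_path)
```

The list comprehension does not have this problem because it is evaluated eagerly inside the with block.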

But if the problem is that the code above or underneath the for is too deep, you'll have to refactor at a different level.

OTHER TIPS

Yes, this is absolutely idiomatic Python.

You shouldn't be bothered too much by multiple levels of indentation. Certainly this is not the only way for it to happen, e.g.

if condition:
    for x in sequence:
        #do something with x

If the level of indentation becomes too great, it's time to refactor into multiple functions. One of the things I love most about Python is that it reduces the friction of breaking things up.

with open("filename") as file_handle:
    result = do_something(file_handle)
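As a rough sketch of what such a do_something might look like: the function takes an already-open handle and does the looping itself. The body here (counting non-blank lines) is a made-up example, and io.StringIO stands in for a real file so the snippet runs without touching disk.

```python
import io


def do_something(file_handle):
    """Hypothetical body: count the non-blank lines in an open text stream."""
    return sum(1 for line in file_handle if line.strip())


# Any iterable of lines works, which also makes the function easy to test.
result = do_something(io.StringIO("first\n\nsecond\n"))
```

Because the function only iterates over what it is given, the caller keeps control of opening and closing the file.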

In short, no, not if you want to keep exactly the same semantics.

If a single indent would irritate you less, you can always do:

with open("filename") as file_handle:
    fle = file_handle.read()

But be careful with big files: read() loads the entire file into memory at once. You can keep a single indent and still iterate more or less line by line if you do:

with open("filename") as file_handle:
    fle = file_handle.readlines()

The lines of your file will be placed in a list, one line per element, and you can then iterate over it like this:

for ln in fle:
    # do something with ln here; it contains one line from your file

Still, be careful with big files, because readlines() also loads everything into memory.
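One detail to keep in mind with readlines(), shown in a small sketch: each element keeps its trailing newline, so you will usually want to strip it. The temporary sample file and the name stripped are only for illustration.

```python
import os
import tempfile

# Create a small sample file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
    tmp.write("a\nb\n")
    sample_path = tmp.name

with open(sample_path) as file_handle:
    fle = file_handle.readlines()

# The file is already closed here; fle is an ordinary list of strings.
stripped = [ln.rstrip("\n") for ln in fle]

os.remove(sample_path)
```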

Just to be explicit:

@ myself, of course that's the idiom! The with/for line in idiom provides several benefits:

  • It automatically closes the file when the block exits, even if an error occurs.
  • It reads the file incrementally, line by line through a buffer, rather than all at once, limiting memory use.
  • It is broadly used; other coders will understand it immediately.
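
The first benefit is easy to check with a small sketch. The temporary sample file and the ValueError raised on a "bad" line are invented for the demonstration.

```python
import os
import tempfile

# Create a small sample file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
    tmp.write("good\nbad\n")
    sample_path = tmp.name

try:
    with open(sample_path) as file_handle:
        for line in file_handle:
            if line.startswith("bad"):
                # Simulate a failure in the middle of processing.
                raise ValueError("cannot process line")
except ValueError:
    pass

# The with block closed the file on the way out, despite the exception.
closed_after_error = file_handle.closed

os.remove(sample_path)
```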
Licensed under: CC-BY-SA with attribution