Question

I'm filtering the directory components using os.walk():

import os

exclude_dirs = ['a', 'b']
for root, dirs, files in os.walk(mytopdir):
    dirs[:] = [d for d in dirs if d not in exclude_dirs]  # 1. Works
    dirs = [d for d in dirs if d not in exclude_dirs]     # 2. Doesn't work

It seems the second one creates a new local variable that hides the original dirs. How does the first one avoid this?

Solution

dirs[:] = ... modifies dirs in place. dirs = ... rebinds the name dirs to a new object. The directories visited by os.walk are affected only if the list that dirs originally refers to is modified in place (and only in the default top-down mode, topdown=True).
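A minimal sketch of that difference, assuming a throwaway tree built with tempfile. The directory names (a/x, b, c/y) and the helper visited_dirs are made up for illustration and are not part of the question:

```python
import os
import tempfile

exclude_dirs = ['a', 'b']

def visited_dirs(top, prune_in_place):
    """Return the sorted relative paths of the directories os.walk visits under top."""
    seen = []
    for root, dirs, files in os.walk(top):
        seen.append(os.path.relpath(root, top))
        kept = [d for d in dirs if d not in exclude_dirs]
        if prune_in_place:
            dirs[:] = kept   # mutates the list that os.walk reads next
        else:
            dirs = kept      # only rebinds the local name; os.walk never sees it
    return sorted(seen)

# Hypothetical layout: top/a/x, top/b, top/c/y
with tempfile.TemporaryDirectory() as top:
    for sub in ('a/x', 'b', 'c/y'):
        os.makedirs(os.path.join(top, *sub.split('/')))
    pruned = visited_dirs(top, prune_in_place=True)
    unpruned = visited_dirs(top, prune_in_place=False)

print(pruned)    # a and b (and everything below them) are skipped
print(unpruned)  # every directory is still visited
```

With slice assignment the walk never descends into a or b; with plain reassignment the filter has no effect on which directories are visited.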

dirs[:] = ... is a form of slice assignment.

In [18]: dirs = list(range(10))

In [19]: dirs
Out[19]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [20]: id(dirs)
Out[20]: 158391724

This slice assignment replaces the values in dirs[5:8] with the characters in 'hello'. Notice that the number of items in the slice (3) does not need to equal the number of items assigned (5).

In [21]: dirs[5:8] = 'hello'

In [22]: dirs
Out[22]: [0, 1, 2, 3, 4, 'h', 'e', 'l', 'l', 'o', 8, 9]

The id does not change:

In [23]: id(dirs)
Out[23]: 158391724

When start and stop slice indices are omitted, the slice is taken to be the entire list:

In [24]: dirs[:] = 'cheese'

In [25]: dirs
Out[25]: ['c', 'h', 'e', 'e', 's', 'e']

Notice that again the id does not change. That indicates dirs still refers to the same object, and the modification was done in place.

In [26]: id(dirs)
Out[26]: 158391724

In contrast, if you reassign dirs to some other value, the id changes, because the name now refers to a different object.

In [27]: dirs = 'spam'

In [28]: id(dirs)
Out[28]: 181415008
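To connect this back to os.walk, here is a rough sketch of why identity matters. mini_walk is a hypothetical toy generator, not the real os.walk implementation; it only assumes the same pattern of yielding a list and reading that same list again after the caller has run:

```python
def mini_walk(tree, top):
    """Toy stand-in for os.walk: tree maps a name to its list of child names."""
    children = list(tree.get(top, []))
    yield top, children             # the caller receives this exact list object
    for child in children:          # read back after the caller's loop body ran
        yield from mini_walk(tree, child)

# Hypothetical hierarchy used only for this illustration
tree = {'top': ['a', 'keep'], 'a': ['a1'], 'keep': ['k1']}

in_place = []
for name, children in mini_walk(tree, 'top'):
    in_place.append(name)
    children[:] = [c for c in children if c != 'a']   # generator sees the change

rebound = []
for name, children in mini_walk(tree, 'top'):
    rebound.append(name)
    children = [c for c in children if c != 'a']      # generator keeps its own list

print(in_place)  # ['top', 'keep', 'k1']
print(rebound)   # ['top', 'a', 'a1', 'keep', 'k1']
```

The generator and the loop body share one list object, so only a change to that object (same id) is visible on the generator's side.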
Licensed under: CC-BY-SA with attribution