Question

I am walking through a directory tree. I found a snippet here that works well, but I cannot figure out why and how its variable dir gets updated where it does.

What I am trying to do is leave out empty folders.

import os
from functools import reduce  # reduce is not a builtin in Python 3

def get_directory_structure(rootdir):
    """
    Creates a nested dictionary that represents the folder structure of rootdir
    """
    dir = {}
    rootdir = rootdir.rstrip(os.sep)
    start = rootdir.rfind(os.sep) + 1
    for path, dirs, files in os.walk(rootdir):
        folders = path[start:].split(os.sep)
        subdir = dict.fromkeys(files)
        parent = reduce(dict.get, folders[:-1], dir)
        parent[folders[-1]] = subdir
    return dir

dir appears to be updated together with parent at this line:

        parent[folders[-1]] = subdir

How come?

dir is mutable and is passed to reduce as input, but it is not assigned there; the assignment happens on the following line.

Any idea?

I want to be able to leave out the empty folders, and would rather find an elegant way to do it. Should I give up and skim through the dict in a second pass?

[Edit after solved] So, as Hans and Adrin pointed out, reduce makes parent refer to dir (or to a dict nested inside it), so they are the same object, and any update to parent updates dir.

I ended up keeping the same code but renamed the vars for clarity:

dir -> token_dict
folders -> path_as_list
subdir -> files_in_dir
parent -> full_dir (and I end up returning full_dir)

More typing, but next time I look, I'll get to it straight away.


Solution 2

You're passing dir to the reduce function. That means you're passing a reference to the object itself, not a copy, so the function can hand that same object back, and any change made through it is visible in dir.

Look at the implementation of the reduce function here, and note the line:

accum_value = function(accum_value, x)

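At this point, accum_value refers to the same object as initializer, which is your dir, and it is passed to function, which in your case is dict.get. dict.get returns the nested dict itself rather than a copy, so parent always ends up being dir or a dict stored inside dir, and assigning into parent changes dir.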
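To make this concrete, here is a minimal sketch of what reduce does in its three-argument form, loosely following the rough equivalent shown in the Python documentation. The name simple_reduce and the sample tree are made up for illustration; this is not the real implementation.

```python
def simple_reduce(function, sequence, initializer):
    # Hypothetical stand-in for functools.reduce (three-argument form only).
    accum_value = initializer      # the caller's object itself, not a copy
    for x in sequence:
        accum_value = function(accum_value, x)
    return accum_value

tree = {'a': {'b': {}}}

# Empty sequence: the initializer itself comes back.
assert simple_reduce(dict.get, [], tree) is tree

# Non-empty sequence: dict.get returns the nested dict object itself.
parent = simple_reduce(dict.get, ['a', 'b'], tree)
assert parent is tree['a']['b']

# Writing through parent is therefore visible from tree.
parent['c'] = {'file.txt': None}
assert tree == {'a': {'b': {'c': {'file.txt': None}}}}
```

Nothing in reduce assigns to tree; the change shows up only because parent and the nested dict are one and the same object.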

Other tips

A little explanation of reduce with dictionaries, for anybody who is not very familiar with reduce:

Before we come to the snippet, let's look at the reduce function itself.

Reduce will apply a function of two arguments cumulatively to the items of a sequence, from left to right, so as to reduce the sequence to a single value.

Here is the syntax:

reduce(function, sequence[, initial]) -> value

If initial is present, it is placed before the items of the sequence in the calculation, and serves as a default when the sequence is empty.

Without initial:

>>> reduce(lambda x, y: x+y, [1, 2, 3, 4, 5])
15
>>>
similar to ((((1+2)+3)+4)+5)

With initial:

>>> reduce(lambda x, y: x+y, [], 1) 
1
>>>

That covers lists. When it comes to dictionaries:

First, let's check what the dict.get() method does:

>>> d = {'a': {'b': {'c': 'files'}}}
>>> dict.get(d,'a')
{'b': {'c': 'files'}}
>>>

So, when you put dict.get method inside reduce, this is what happens:

>>> d = {'a': {'b': {'c': 'files'}}}
>>> reduce(dict.get, ['a','b','c'], d)
'files'
>>>

Which is similar to:

>>> dict.get(dict.get(dict.get(d,'a'),'b'),'c')
'files'
>>>

And when the list is empty, you get back the initial value itself, here an empty dict:

>>> reduce(dict.get, [], {})
{}
>>>

Let's come back to your snippet:

dir in your snippet is not the builtin dir() function; it is just a name bound to a dictionary that starts out empty.

parent = reduce(dict.get, folders[:-1], dir)

So, in the above line, folders[:-1] is the list of folder names above the current directory, and dir is the dictionary being built (empty only on the first iteration).
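A small trace may help show how the loop builds the tree. The walk list below is a made-up stand-in for what os.walk might yield for a tiny directory, with each path already split into folder names; tree plays the role of dir.

```python
from functools import reduce

# Hypothetical data: (folders, files) pairs in top-down order, as os.walk visits them.
walk = [
    (['root'], ['a.txt']),
    (['root', 'sub'], ['b.txt']),
    (['root', 'sub', 'deep'], []),
]

tree = {}
for folders, files in walk:
    subdir = dict.fromkeys(files)
    # Step down from tree to the dict of the folder that contains this one.
    parent = reduce(dict.get, folders[:-1], tree)
    # parent is tree itself or a dict nested inside it, so this
    # assignment changes what tree contains.
    parent[folders[-1]] = subdir

print(tree)
# {'root': {'a.txt': None, 'sub': {'b.txt': None, 'deep': {}}}}
```

On the first pass folders[:-1] is empty, so parent is tree itself; on later passes parent is one of the dicts already stored inside tree. This relies on os.walk visiting parents before children, which is its default top-down order.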

Please let me know if it helps in any way.
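As for leaving out the empty folders, one option is the second pass mentioned in the question. The sketch below assumes the structure produced by the snippet, where files map to None and folders map to dicts; prune_empty is a made-up helper name, and treating a folder that holds only empty folders as empty is an assumption about what is wanted.

```python
def prune_empty(tree):
    """Return a copy of tree without folders that end up empty."""
    pruned = {}
    for name, value in tree.items():
        if isinstance(value, dict):      # a folder
            value = prune_empty(value)
            if not value:                # nothing left inside: leave it out
                continue
        pruned[name] = value             # a file (None) or a non-empty folder
    return pruned

tree = {
    'root': {
        'a.txt': None,
        'empty': {},
        'sub': {'deep': {}, 'b.txt': None},
        'only_empty': {'x': {}},
    }
}

print(prune_empty(tree))
# {'root': {'a.txt': None, 'sub': {'b.txt': None}}}
```

Because it builds a new dict, the original tree is left untouched.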

License: CC-BY-SA with attribution
Not affiliated with StackOverflow