python - recursively deleting dict keys?

Question 1

Adding or removing items to/from a sequence while iterating over this sequence is tricky at best, and just illegal (as you just discovered) with dicts. The right way to remove entries from a dict while iterating over it is to iterate on a snapshot of the keys. In Python 2.x, dict.keys() provides such a snapshot. So for dicts the solution is:

for key in mydict.keys():
    if key == bad_value:
        del mydict[key]

As mentionned by cpizza in a comment, for python3, you'll need to explicitely create the snapshot using list():

for key in list(mydict.keys()):
    if key == bad_value:
        del mydict[key]

For lists, trying to iterate on a snapshot of the indexes (ie for i in len(thelist):) would result in an IndexError as soon as anything is removed (obviously since at least the last index will no more exist), and even if not you might skip one or more items (since the removal of an item makes the sequence of indexes out of sync with the list itself). enumerate is safe against IndexError (since the iteration will stop by itself when there's no more 'next' item in the list, but you'll still skip items:

>>> mylist = list("aabbccddeeffgghhii")
>>> for x, v  in enumerate(mylist):
...     if v in "bdfh":
...         del mylist[x]
>>> print mylist
['a', 'a', 'b', 'c', 'c', 'd', 'e', 'e', 'f', 'g', 'g', 'h', 'i', 'i']

Not a quite a success, as you can see.

The known solution here is to iterate on reversed indexes, ie:

>>> mylist = list("aabbccddeeffgghhii")
>>> for x in reversed(range(len(mylist))):
...     if mylist[x] in "bdfh":
...         del mylist[x]
>>> print mylist
['a', 'a', 'c', 'c', 'e', 'e', 'g', 'g', 'i', 'i']

This works with reversed enumeration too, but we dont really care.

So to summarize: you need two different code path for dicts and lists - and you also need to take care of "not container" values (values which are neither lists nor dicts), something you do not take care of in your current code.

def scrub(obj, bad_key="_this_is_bad"):
    if isinstance(obj, dict):
        # the call to `list` is useless for py2 but makes
        # the code py2/py3 compatible
        for key in list(obj.keys()):
            if key == bad_key:
                del obj[key]
            else:
                scrub(obj[key], bad_key)
    elif isinstance(obj, list):
        for i in reversed(range(len(obj))):
            if obj[i] == bad_key:
                del obj[i]
            else:
                scrub(obj[i], bad_key)

    else:
        # neither a dict nor a list, do nothing
        pass

As a side note: never write a bare except clause. Never ever. This should be illegal syntax, really.

Question 2

Here a generalized version of the one of @bruno desthuilliers, with a callable to test against the keys.

def clean_dict(obj, func):
    """
    This method scrolls the entire 'obj' to delete every key for which the 'callable' returns
    True

    :param obj: a dictionary or a list of dictionaries to clean
    :param func: a callable that takes a key in argument and return True for each key to delete
    """
    if isinstance(obj, dict):
        # the call to `list` is useless for py2 but makes
        # the code py2/py3 compatible
        for key in list(obj.keys()):
            if func(key):
                del obj[key]
            else:
                clean_dict(obj[key], func)
    elif isinstance(obj, list):
        for i in reversed(range(len(obj))):
            if func(obj[i]):
                del obj[i]
            else:
                clean_dict(obj[i], func)

    else:
        # neither a dict nor a list, do nothing
        pass

And an example with a regex callable :

func = lambda key: re.match(r"^<div>", key)

clean_dict(obj, func)

Question 3

def walk(d, badvalue, answer=None, sofar=None):
    if sofar is None:
        sofar = []
    if answer is None:
        answer = []
    for k,v in d.iteritems():
        if k == badvalue:
            answer.append(sofar + [k])
        if isinstance(v, dict):
            walk(v, badvalue, answer, sofar+[k])
    return answer

def delKeys(d, badvalue):
    for path in walk(d, badvalue):
        dd = d
        while len(path) > 1:
            dd = dd[path[0]]
            path.pop(0)
        dd.pop(path[0])

Output

In [30]: d = {1:{2:3}, 2:{3:4}, 5:{6:{2:3}, 7:{1:2, 2:3}}, 3:4}

In [31]: delKeys(d, 2)

In [32]: d
Out[32]: {1: {}, 3: 4, 5: {6: {}, 7: {1: 2}}}