Weirdness of itertools.groupby in Python when realizing the groupby result early [duplicate]

StackOverflow https://stackoverflow.com/questions/22706606

  •  23-06-2023

Question

First, apologies for my poor description of the problem. I can't find a better one.

I found that applying list() to the result of itertools.groupby destroys the groups. See the code:

import itertools
import operator

log = '''\
hello world
hello there
hi guys
hi girls'''.split('\n')

data = [line.split() for line in log]

grouped = list(itertools.groupby(data, operator.itemgetter(0)))

for key, group in grouped:
    print key, group, list(group)

print '-'*80

grouped = itertools.groupby(data, operator.itemgetter(0))

for key, group in grouped:
    print key, group, list(group)

The result is:

hello <itertools._grouper object at 0x01A86050> []
hi <itertools._grouper object at 0x01A86070> [['hi', 'girls']]
--------------------------------------------------------------------------------
<itertools.groupby object at 0x01A824E0>
hello <itertools._grouper object at 0x01A860B0> [['hello', 'world'], ['hello', 'there']]
hi <itertools._grouper object at 0x01A7DFF0> [['hi', 'guys'], ['hi', 'girls']]

This is probably related to the internal workings of the groupby function. Nevertheless, it surprised me today.


Solution

This is documented:

The returned group is itself an iterator that shares the underlying iterable with groupby(). Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible.

When you do list(groupby(...)), you advance the groupby object all the way to the end, thus losing all groups except the last. If you need to keep the groups, do as shown in the documentation and save each one as a list while iterating over the groupby object.
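
A minimal sketch of that approach, using the parsed data from the question. The names `keys` and `groups` are chosen here for illustration, and the code is written to run on Python 3 (the question itself uses Python 2 print statements):

```python
import itertools
import operator

# The parsed log lines from the question.
data = [['hello', 'world'], ['hello', 'there'],
        ['hi', 'guys'], ['hi', 'girls']]

keys = []
groups = []
for key, group in itertools.groupby(data, operator.itemgetter(0)):
    # Copy the group into a list now; the group iterator shares its
    # source with groupby and is lost once groupby advances.
    keys.append(key)
    groups.append(list(group))

print(keys)
print(groups)
```

Because each group is copied before the loop moves on, both lists stay intact after the groupby object is exhausted.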

Other tips

The example in the documentation is not as nice as:

list((key, list(group)) for key, group in itertools.groupby(...))

This turns the iterator into a list of tuples, each holding a key and a list of that group's items, i.e. [(key, [item, ...]), ...], if that is what is desired.
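
Applied to the data from the question, the one-liner might look like the sketch below. The variable names `grouped` and `members` are illustrative, and a list comprehension is used in place of list() around a generator expression, which should be equivalent here:

```python
import itertools
import operator

# The parsed log lines from the question.
data = [['hello', 'world'], ['hello', 'there'],
        ['hi', 'guys'], ['hi', 'girls']]

# Materialize each group as a list while groupby is still positioned on it.
grouped = [(key, list(group))
           for key, group in itertools.groupby(data, operator.itemgetter(0))]

# The result can now be iterated as often as needed.
for key, members in grouped:
    print(key, members)
```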

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow