Question

I have this:

>>> sum( i*i for i in xrange(5))

My question is: in this case, am I passing a list comprehension or a generator object to sum? How can I tell? Is there a general rule for this?

Also remember sum by itself needs a pair of parentheses to surround its arguments. I'd think that the parentheses above are for sum and not for creating a generator object. Wouldn't you agree?


Solution

You are passing in a generator expression.

A list comprehension is specified with square brackets ([...]). A list comprehension builds a list object first, so it uses syntax closely related to the list literal syntax:

list_literal = [1, 2, 3]
list_comprehension = [i for i in range(4) if i > 0]

A generator expression, on the other hand, creates an iterator object. The loop it contains runs, and items are produced, only when you iterate over that object. The generator expression does not retain those items; no list object is built.
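The lazy behaviour can be observed directly. The following is a minimal sketch, assuming Python 3 (where range takes the place of xrange); the helper square and the list log are hypothetical names introduced only for this illustration.

```python
log = []

def square(i):
    # Hypothetical helper: record each call so we can see when the loop body runs.
    log.append(i)
    return i * i

gen = (square(i) for i in range(3))   # creates the generator; square is not called yet
calls_before = list(log)              # snapshot of the log: still empty

first = next(gen)                     # runs the loop just far enough to produce one item
calls_after_first = list(log)

rest = list(gen)                      # drains the remaining items
leftover = list(gen)                  # the generator is exhausted; nothing was retained
```

Iterating a second time yields nothing, which is the practical consequence of no list being kept around.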

A generator expression is always written with round parentheses (...), but when it is the only argument to a call, the extra parentheses can be omitted; the following two expressions are equivalent:

sum((i*i for i in xrange(5)))  # with parentheses around the generator
sum(i*i for i in xrange(5))    # without parentheses around the generator

Quoting from the generator expression documentation:

The parentheses can be omitted on calls with only one argument. See section Calls for the detail.
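The "only one argument" condition matters as soon as a second argument is passed, such as the start value of sum. A short sketch, assuming Python 3 (range instead of xrange); compile is used here only to show that the unparenthesized form is rejected as a syntax error rather than at run time.

```python
# Sole argument: the call's own parentheses are enough.
total = sum(i * i for i in range(5))

# With a second argument (the start value), the generator needs its own parentheses.
total_from_ten = sum((i * i for i in range(5)), 10)

# Without them, the source does not even compile.
try:
    compile("sum(i * i for i in range(5), 10)", "<example>", "eval")
    unparenthesized_ok = True
except SyntaxError:
    unparenthesized_ok = False
```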

OTHER TIPS

List comprehensions are enclosed in []:

>>> [i*i for i in xrange(5)]  # list comprehension
[0, 1, 4, 9, 16]
>>> (i*i for i in xrange(5))  # generator
<generator object <genexpr> at 0x2cee40>

You are passing a generator.

That is a generator:

>>> (i*i for i in xrange(5))
<generator object <genexpr> at 0x01A27A08>
>>>

List comprehensions are enclosed in [].

You might also be asking, "does this syntax truly cause sum to consume a generator one item at a time, or does it secretly create a list of every item in the generator first"? One way to check this is to try it on a very large range and watch memory usage:

sum(i for i in xrange(int(1e8)))

Memory usage in this case stays constant, whereas range(int(1e8)) in Python 2 builds the full list and consumes several hundred MB of RAM.
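A smaller-scale version of the same check can be done without watching a process monitor. This is only a rough sketch, assuming Python 3: sys.getsizeof reports the size of the container object itself (not the integers it refers to), so the numbers understate the list's real cost, but the difference in scale is still visible.

```python
import sys

n = 100000
as_generator = (i for i in range(n))
as_list = [i for i in range(n)]

gen_size = sys.getsizeof(as_generator)   # size of the generator object only
list_size = sys.getsizeof(as_list)       # size of the list's pointer array, not the ints

# Both forms give sum the same items; only the generator avoids building the list.
total_gen = sum(as_generator)
total_list = sum(as_list)
```

The generator's size does not depend on n, while the list's grows with it.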

You can test that the parentheses are optional:

def print_it(obj):
    print obj

print_it(i for i in xrange(5))
# prints <generator object <genexpr> at 0x03853C60>
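To answer "how do I tell?" in code rather than by reading the printed repr, the type of the received object can be checked. A minimal sketch, assuming Python 3; kind_of is a hypothetical helper written for this example, while types.GeneratorType is the standard-library name for the generator type.

```python
import types

def kind_of(obj):
    # Hypothetical helper for illustration; not part of any library.
    if isinstance(obj, types.GeneratorType):
        return "generator"
    if isinstance(obj, list):
        return "list"
    return type(obj).__name__

without_parens = kind_of(i for i in range(5))    # bare generator expression as sole argument
with_parens = kind_of((i for i in range(5)))     # explicitly parenthesized
with_brackets = kind_of([i for i in range(5)])   # list comprehension
```

The first two calls receive the same kind of object; only the square brackets change what is passed.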

I tried this:

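#!/usr/bin/env python

class myclass:

    def __init__(self, arg):
        # Keep the argument as received, then show its type and repr.
        self.p = arg
        print type(self.p)
        print self.p


if __name__ == '__main__':
    # The generator expression is the only argument, so no extra parentheses are needed.
    c = myclass(i*i for i in xrange(5))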

And this prints:

$ ./genexprorlistcomp.py 
<type 'generator'>
<generator object <genexpr> at 0x7f5344c7cf00>

This is consistent with what Martin and mdscruggs explained in their posts.

You are passing a generator object; a list comprehension is surrounded by [].

Licensed under: CC-BY-SA with attribution