Question

I just read the answer to this question: Accessing class variables from a list comprehension in the class definition

It helped me understand why the following code results in NameError: name 'x' is not defined:

class A:
    x = 1
    data = [0, 1, 2, 3]
    new_data = [i + x for i in data]
    print(new_data)

The NameError occurs because x is not defined in the special scope created for the list comprehension. But I am unable to understand why the following code works without any error.

class A:
    x = 1
    data = [0, 1, 2, 3]
    new_data = [i for i in data]
    print(new_data)

I get the output [0, 1, 2, 3]. But I was expecting this error: NameError: name 'data' is not defined. In the previous example the name x is not defined in the list comprehension's scope, so I expected that the name data would not be defined there either.

Could you please help me to understand why x is not defined in the list comprehension's scope but data is?

Solution

data is the source of the list comprehension; it is the one parameter that is passed to the nested scope created.

Everything in the list comprehension is run in a separate scope (as a function, basically), except for the iterable used for the left-most for loop. You can see this in the byte code:

>>> import dis
>>> def foo():
...     return [i for i in data]
... 
>>> dis.dis(foo)
  2           0 LOAD_CONST               1 (<code object <listcomp> at 0x105390390, file "<stdin>", line 2>)
              3 LOAD_CONST               2 ('foo.<locals>.<listcomp>')
              6 MAKE_FUNCTION            0
              9 LOAD_GLOBAL              0 (data)
             12 GET_ITER
             13 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             16 RETURN_VALUE

The <listcomp> code object is called like a function, and iter(data) is passed in as the argument (CALL_FUNCTION is executed with 1 positional argument, the GET_ITER result).
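To make the call visible in source form, here is a rough hand-written equivalent of the working class from the question. It is only a sketch: the helper name _listcomp and its parameter name iterator are made up for illustration (the real code object has no callable name, and its parameter is the implicit .0).

```python
class A:
    x = 1
    data = [0, 1, 2, 3]

    # Hypothetical stand-in for the hidden <listcomp> function.
    def _listcomp(iterator):
        result = []
        for i in iterator:
            result.append(i)
        return result

    # iter(data) is evaluated here, in the class body, where data is visible.
    # Only the resulting iterator crosses into the nested scope.
    new_data = _listcomp(iter(data))
    del _listcomp  # keep the illustrative helper out of the class namespace


print(A.new_data)  # [0, 1, 2, 3]
```

The body of the helper never looks up the name data; it only consumes the iterator it was handed, which is why no NameError occurs.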

The <listcomp> code object looks for that one argument:

>>> dis.dis(foo.__code__.co_consts[1])
  2           0 BUILD_LIST               0
              3 LOAD_FAST                0 (.0)
        >>    6 FOR_ITER                12 (to 21)
              9 STORE_FAST               1 (i)
             12 LOAD_FAST                1 (i)
             15 LIST_APPEND              2
             18 JUMP_ABSOLUTE            6
        >>   21 RETURN_VALUE

The LOAD_FAST instruction loads the first and only positional argument passed in; it has no real name here (it shows up as .0) because there never was a function definition to give it one.

Any additional names used in the list comprehension (or set or dict comprehension, or generator expression, for that matter) are either locals, closures or globals, not parameters.
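A small side-by-side sketch of that difference, assuming Python 3 and assuming no global named x exists. The function names in_function and in_class_body are invented for this example. The same comprehension finds x when the enclosing block is a function, but not when it is a class body, because nested scopes skip class blocks during name lookup.

```python
def in_function():
    x = 1
    data = [0, 1, 2, 3]
    # x belongs to the enclosing function, so the comprehension can reach it.
    return [i + x for i in data]


def in_class_body():
    try:
        class A:
            x = 1
            data = [0, 1, 2, 3]
            # x belongs to the class body, which the nested scope skips;
            # the lookup falls through to the enclosing function and globals.
            new_data = [i + x for i in data]
    except NameError:
        return "NameError"
    return A.new_data


function_result = in_function()
class_result = in_class_body()
print(function_result, class_result)
```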

If you go back to my answer to that question, look for the section titled The (small) exception; or, why one part may still work; I tried to cover this specific point there:

There's one part of a comprehension or generator expression that executes in the surrounding scope, regardless of Python version. That would be the expression for the outermost iterable.
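One way to see that the exception covers only the outermost iterable is to add a second for clause. In this sketch (class and attribute names are invented, and no global named cols is assumed to exist), rows is evaluated in the class body, while cols is looked up from inside the nested scope and is not found.

```python
class Works:
    rows = [1, 2]
    # rows is the outermost iterable: evaluated in the class body.
    doubled = [r * 2 for r in rows]


def build_nested():
    try:
        class Fails:
            rows = [1, 2]
            cols = [10, 20]
            # rows is still fine, but cols is the iterable of the second
            # for clause, so it is resolved inside the nested scope.
            pairs = [(r, c) for r in rows for c in cols]
    except NameError as exc:
        return exc
    return None


nested_error = build_nested()
print(Works.doubled, type(nested_error).__name__)
```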

OTHER TIPS

The dis.dis answer is interesting but it does not actually explain why that happens. Here is the explanation, quoted in the context of a similar error:

If a name binding operation occurs anywhere within a code block, all uses of the name within the block are treated as references to the current block. This can lead to errors when a name is used within a block before it is bound. This rule is subtle. Python lacks declarations and allows name-binding operations to occur anywhere within a code block. The local variables of a code block can be determined by scanning the entire text of the block for name binding operations.

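So in simple terms: the comprehension cannot refer to x because names bound in the class block are not visible from the comprehension's nested scope, and the name A is not bound until the class body has finished executing. Inside the comprehension there is no way to refer to x, either as x alone or as A.x.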

Source: Python docs.
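The point about A.x can be checked with a short sketch, assuming nothing named A has been defined earlier. The variable qualified_error is introduced only to capture the result. The class statement binds the name A after its body has run, so the lookup of A inside the comprehension fails.

```python
try:
    class A:
        x = 1
        data = [0, 1, 2, 3]
        # A is not bound yet while the class body is still executing.
        new_data = [i + A.x for i in data]
except NameError as exc:
    qualified_error = exc
else:
    qualified_error = None

print(type(qualified_error).__name__)
```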

Licensed under: CC-BY-SA with attribution