Question

I'm sure this has been answered somewhere but I wasn't sure how to describe it.

Let's say I want to create a list containing 3 empty lists, like so:

lst = [[], [], []]

I thought I was being all clever by doing this:

lst = [[]] * 3

But I discovered, after debugging some weird behavior, that this caused an append update to one sublist, say lst[0].append(3), to update the entire list, making it [[3], [3], [3]] rather than [[3], [], []].

However, if I initialize the list with

lst = [[] for i in range(3)]

then doing lst[1].append(5) gives the expected [[], [5], []].

My question is why does this happen? It is interesting to note that if I do

lst = [[]]*3
lst[0] = [5]
lst[0].append(3)

then the 'linkage' of cell 0 is broken and I get [[5,3],[],[]], but lst[1].append(0) still causes [[5,3],[0],[0]].

My best guess is that using multiplication in the form [[]]*x causes Python to store a reference to a single cell...?

Solution

My best guess is that using multiplication in the form [[]] * x causes Python to store a reference to a single cell...?

Yes, and you can test this yourself:

>>> lst = [[]] * 3
>>> print([id(x) for x in lst])
[11124864, 11124864, 11124864]

This shows that all three references refer to the same object. And note that it really makes perfect sense that this happens [1]. Multiplication just copies the values, and in this case the values are references. That's why you see the same reference repeated three times.
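One way to picture this is the rough sketch below. It is a mental model, not a description of how the interpreter implements list multiplication: `[[]] * 3` behaves like a loop that appends the same inner list three times. The helper name `repeat_reference` is made up for illustration.

```python
def repeat_reference(item, n):
    """Build a list holding n references to the same object (hypothetical helper)."""
    result = []
    for _ in range(n):
        result.append(item)  # appends the reference; no copy is made
    return result

inner = []
lst = repeat_reference(inner, 3)
same_as_multiplication = [inner] * 3

lst[0].append(3)               # mutates the one shared inner list
print(lst)                     # [[3], [3], [3]]
print(same_as_multiplication)  # [[3], [3], [3]], it holds the same inner list
```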

It is interesting to note that if I do

lst = [[]]*3
lst[0] = [5]
lst[0].append(3)

then the 'linkage' of cell 0 is broken and I get [[5,3],[],[]], but lst[1].append(0) still causes [[5,3],[0],[0]].

You changed the reference that occupies lst[0]; that is, you assigned a new value to lst[0]. But you didn't change the other elements: they still refer to the same object they referred to before. lst[1] and lst[2] still refer to exactly the same instance, so of course appending an item to lst[1] causes lst[2] to see that change too.
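The difference between rebinding a slot and mutating the object it refers to can be checked with `is`. This is only a small sketch; the name `shared` is introduced here for illustration.

```python
lst = [[]] * 3
shared = lst[1]          # keep a name for the original shared list

lst[0] = [5]             # rebinds slot 0 to a brand-new list
lst[0].append(3)         # mutates only that new list

print(lst[0] is shared)  # False: slot 0 no longer refers to the shared list
print(lst[1] is lst[2])  # True: slots 1 and 2 still share one list

lst[1].append(0)         # mutates the shared list, visible through both slots
print(lst)               # [[5, 3], [0], [0]]
```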

This is a classic mistake people make with pointers and references. Here's a simple analogy. You have a piece of paper. On it, you write the address of someone's house. You now take that piece of paper and photocopy it twice, so you end up with three pieces of paper with the same address written on them.

Now take the first piece of paper, scribble out the address written on it, and write the address of someone else's house. Did the address written on the other two pieces of paper change? No. That's exactly what your code did, and that's why the other two items don't change.

Further, imagine that the owner of the house whose address is still on the second piece of paper builds an add-on garage. Now I ask you: does the house whose address is on the third piece of paper have an add-on garage? Yes, it does, because it's exactly the same house as the one whose address is written on the second piece of paper. This explains everything about your second code example.

[1] You didn't expect Python to invoke a "copy constructor", did you? Puke.

OTHER TIPS

This is because sequence multiplication merely repeats the references. When you write [[]] * 2, you create a new list with two elements, but both of these elements are the same object in memory, namely an empty list. Hence, a change in one is reflected in the other. The comprehension, by contrast, creates a new, independent list on each iteration:

>>> l1 = [[]] * 2
>>> l2 = [[] for _ in range(2)]
>>> l1[0] is l1[1]
True
>>> l2[0] is l2[1]
False

All the elements reference the same list.

There are similar questions here and here

And from the FAQ:

" * doesn’t create copies, it only creates references to the existing objects."

Your guess that using multiplication in the form [[]] * x causes Python to store a reference to a single cell is correct.

So you end up with a list of 3 references to the same list.
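The FAQ entry quoted above is about the same pitfall with nested lists. Here is a small sketch of that case; the variable names are chosen for illustration. Note that `[None] * cols` is harmless on its own: assigning to a slot rebinds that slot, and `None` is never mutated in place.

```python
rows, cols = 3, 2

# One row object, referenced `rows` times by the outer list.
grid_shared = [[None] * cols] * rows

# A fresh row object is built on every iteration.
grid_independent = [[None] * cols for _ in range(rows)]

grid_shared[0][0] = 5
grid_independent[0][0] = 5

print(grid_shared)       # [[5, None], [5, None], [5, None]]
print(grid_independent)  # [[5, None], [None, None], [None, None]]
```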

Basically what is happening in your first example is that a list is being created with multiple references to the same inner list. Here's a breakdown.

>>> a = []
>>> b = [a]
>>> c = b * 3  # c now contains three references to a
>>> d = [a for _ in range(4)]  # and d contains four references to a
>>> print(c)
[[], [], []]
>>> print(d)
[[], [], [], []]
>>> a.append(3)
>>> print(c)
[[3], [3], [3]]
>>> print(d)
[[3], [3], [3], [3]]
>>> x = [[]] * 3  # shorthand equivalent to c
>>> print(x)
[[], [], []]
>>> x[0].append(3)
>>> print(x)
[[3], [3], [3]]

The above is equivalent to your first example. Now that the inner list has its own name, it is hopefully clearer why the change shows up everywhere. c[0] is c[1] will evaluate as True, because both expressions evaluate to the same object (a).

Your second example creates multiple different inner list objects.

>>> c = [[], [], []]  # this line creates four different lists (one outer, three inner)
>>> d = [[] for _ in range(3)]  # so does this line
>>> c[0].append(4)
>>> d[0].append(5)
>>> print(c)
[[4], [], []]
>>> print(d)
[[5], [], []]
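To check which of the constructions above share their inner lists, one option is to count the distinct objects among the elements. The helper `count_distinct` is a hypothetical name for this sketch. It relies on `id()` being unique among objects that are alive at the same time.

```python
def count_distinct(seq):
    """Return how many different objects the elements of seq refer to (hypothetical helper)."""
    return len({id(x) for x in seq})

print(count_distinct([[]] * 3))                # 1: one inner list, referenced three times
print(count_distinct([[], [], []]))            # 3: three separate inner lists
print(count_distinct([[] for _ in range(3)]))  # 3: a new inner list per iteration
```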
Licensed under: CC-BY-SA with attribution