Question

I have the following piece of code:

import numpy as np
rand_draw1 = np.random.rand(5,4)
rand_draw2 = rand_draw1
rand_draw2[0:2,0:4] = np.random.rand(2,4)

My intention is for rand_draw1 and rand_draw2 to be identical except for the first two rows. However, they turn out to be completely identical, including the first two rows.

Initially I thought this question answered my problem: "Random Number Generation - Same Number returned". It suggested that the random draws are based on the machine clock, so commands executed at virtually the same instant draw the same numbers. But if that is the case, why do I get the same result when running this in the terminal (i.e. typing the lines one by one)?

To summarize, I have two questions:

- How do I fix my code?
- Is it indeed the machine-time 'problem' that causes this?

Thanks in advance!

Solution

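Assigning rand_draw2 = rand_draw1 does not create a copy. It simply binds the name rand_draw2 to the same object already bound to rand_draw1, so a write through either name changes the one shared array. The machine clock and the random number generator are not involved: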

>>> rand_draw2 = rand_draw1
>>> rand_draw2 is rand_draw1
True
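
To make the consequence of that shared binding concrete, here is a minimal sketch. The names a, b and c and the small zero-filled array are illustrative assumptions, not part of the original code; np.shares_memory is used only to confirm which arrays use the same buffer.

```python
import numpy as np

a = np.zeros((3, 2))
b = a            # second name for the same array, no data is copied
c = a.copy()     # independent array with its own data

b[0, 0] = 1.0    # write through the alias

print(a[0, 0])                  # 1.0, the write is visible through `a`
print(c[0, 0])                  # 0.0, the copy is untouched
print(np.shares_memory(a, b))   # True
print(np.shares_memory(a, c))   # False
```

The same thing happens in the question: the slice assignment on rand_draw2 overwrites the first two rows of the only array that exists.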

Instead, you need to explicitly copy rand_draw1, and assign the copy to rand_draw2:

>>> rand_draw1 = np.random.rand(5, 4)
>>> rand_draw2 = rand_draw1.copy()
>>> rand_draw2[0:2] = np.random.rand(2, 4)
>>> rand_draw1
array([[ 0.08254004,  0.51848814,  0.69348487,  0.44053008],
       [ 0.75273107,  0.64677024,  0.78397813,  0.12768647],
       [ 0.37552669,  0.8365069 ,  0.44490398,  0.3943413 ],
       [ 0.27263619,  0.40379047,  0.43227555,  0.61552473],
       [ 0.55214161,  0.21380748,  0.34122889,  0.44029075]])
>>> rand_draw2
array([[ 0.26229975,  0.02754367,  0.7989174 ,  0.94619982],
       [ 0.40869498,  0.01327566,  0.06437938,  0.94647506],
       [ 0.37552669,  0.8365069 ,  0.44490398,  0.3943413 ],
       [ 0.27263619,  0.40379047,  0.43227555,  0.61552473],
       [ 0.55214161,  0.21380748,  0.34122889,  0.44029075]])

See e.g. here for a good explanation of how names in Python work.
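
The copy-then-overwrite step can also be wrapped in a small function, which makes it easy to check the result. This is only a sketch: the helper name redraw_first_rows and the seed value are made up for illustration, and it assumes NumPy 1.17 or newer for np.random.default_rng rather than the np.random.rand call used in the question.

```python
import numpy as np

def redraw_first_rows(draw, n_rows, rng):
    """Return a copy of `draw` whose first `n_rows` rows are freshly drawn."""
    out = draw.copy()                      # leave the caller's array untouched
    out[:n_rows] = rng.random((n_rows, draw.shape[1]))
    return out

rng = np.random.default_rng(12345)         # fixed seed, chosen arbitrarily
rand_draw1 = rng.random((5, 4))
rand_draw2 = redraw_first_rows(rand_draw1, 2, rng)

# The untouched rows agree, the redrawn rows do not.
print(np.array_equal(rand_draw1[2:], rand_draw2[2:]))
print(np.array_equal(rand_draw1[:2], rand_draw2[:2]))
```

Because the generator is seeded explicitly, the result does not depend on when the lines are executed, which is another way to see that timing is not the cause of the original behaviour.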

OTHER TIPS

In Python, assignment does not create a copy of the object, so both names, rand_draw1 and rand_draw2, refer to the same object. Thus, when you modify the array through the second name (rand_draw2), the change is also visible through the first.

>>> import numpy as np
>>> rand_draw1 = np.random.rand(5,4)
>>> rand_draw2 = rand_draw1
>>> print(id(rand_draw1), id(rand_draw2))
40407360 40407360

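To create a copy, use the array's .copy() method. For a numeric array like this one that is sufficient. A deep copy (copy.deepcopy) is only needed when the array has dtype=object and holds mutable Python objects, because .copy() then copies only the references.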

>>> rand_draw2 = rand_draw1.copy()
>>> id(rand_draw2)
41090720

Now the slice assignment only affects rand_draw2:

>>> rand_draw2[0:2,0:4] = np.random.rand(2,4)
>>> print(rand_draw1)
[[ 0.46171859  0.6766379   0.97746539  0.15278117]
 [ 0.93963979  0.19853993  0.29979121  0.10237192]
 [ 0.15283647  0.21643831  0.21335029  0.42910395]
 [ 0.92836103  0.03468904  0.40524073  0.90284648]
 [ 0.05225297  0.83740986  0.43472966  0.08430102]]
>>> print(rand_draw2)
[[ 0.37539354  0.71703056  0.76480003  0.95918987]
 [ 0.15026104  0.04198227  0.58959412  0.45517846]
 [ 0.15283647  0.21643831  0.21335029  0.42910395]
 [ 0.92836103  0.03468904  0.40524073  0.90284648]
 [ 0.05225297  0.83740986  0.43472966  0.08430102]]
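
As for the remark about deepcopy: it should only matter for arrays of dtype=object. The following sketch uses a made-up object array named nested (not part of the question) to show the difference between .copy() and copy.deepcopy in that case.

```python
import copy
import numpy as np

# An object array holds references to Python objects, here two lists.
nested = np.empty(2, dtype=object)
nested[0] = [1, 2]
nested[1] = [3, 4]

shallow = nested.copy()          # new array, but the same list objects
deep = copy.deepcopy(nested)     # new array and new list objects

nested[0].append(99)             # mutate one of the lists in place

print(shallow[0])   # [1, 2, 99]
print(deep[0])      # [1, 2]
```

For the float array returned by np.random.rand, .copy() already yields fully independent data, so none of this applies to the question's code.
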
Licensed under: CC-BY-SA with attribution