Question

What is the efficient(probably vectorized with Matlab terminology) way to generate random number of zeros and ones with a specific proportion? Specially with Numpy?

As my case is special for 1/3, my code is:

import numpy as np 
a=np.mod(np.multiply(np.random.randomintegers(0,2,size)),3)

But is there any built-in function that could handle this more effeciently at least for the situation of K/N where K and N are natural numbers?

Was it helpful?

Solution 3

If I understand your problem correctly, you might get some help with numpy.random.shuffle

>>> def rand_bin_array(K, N):
    arr = np.zeros(N)
    arr[:K]  = 1
    np.random.shuffle(arr)
    return arr

>>> rand_bin_array(5,15)
array([ 0.,  1.,  0.,  1.,  1.,  1.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,
        0.,  0.])

OTHER TIPS

Yet another approach, using np.random.choice:

>>> np.random.choice([0, 1], size=(10,), p=[1./3, 2./3])
array([0, 1, 1, 1, 1, 0, 0, 0, 0, 0])

A simple way to do this would be to first generate an ndarray with the proportion of zeros and ones you want:

>>> import numpy as np
>>> N = 100
>>> K = 30 # K zeros, N-K ones
>>> arr = np.array([0] * K + [1] * (N-K))
>>> arr
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1])

Then you can just shuffle the array, making the distribution random:

>>> np.random.shuffle(arr)
>>> arr
array([1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0,
       1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1,
       1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1,
       0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1,
       1, 1, 1, 0, 1, 1, 1, 1])

Note that this approach will give you the exact proportion of zeros/ones you request, unlike say the binomial approach. If you don't need the exact proportion, then the binomial approach will work just fine.

You can use numpy.random.binomial. E.g. suppose frac is the proportion of ones:

In [50]: frac = 0.15

In [51]: sample = np.random.binomial(1, frac, size=10000)

In [52]: sample.sum()
Out[52]: 1567

Another way of getting the exact number of ones and zeroes is to sample indices without replacement using np.random.choice:

arr_len = 30
num_ones = 8

arr = np.zeros(arr_len, dtype=int)
idx = np.random.choice(range(arr_len), num_ones, replace=False)
arr[idx] = 1

Out:

arr

array([0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1,
       0, 0, 0, 0, 0, 1, 0, 0])

Simple one-liner: you can avoid using lists of integers and probability distributions, which are unintuitive and overkill for this problem in my opinion, by simply working with bools first and then casting to int if necessary (though leaving it as a bool array should work in most cases).

>>> import numpy as np
>>> np.random.random(9) < 1/3.
array([False,  True,  True,  True,  True, False, False, False, False])   
>>> (np.random.random(9) < 1/3.).astype(int)
array([0, 0, 0, 0, 0, 1, 0, 0, 1])    
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top