numpy random.choice elements that are not selected

https://stackoverflow.com/questions/22876567

28-06-2023

Question

I have an array A as below:

import numpy as np
A = np.random.sample(100)

I want to split A into two random subsets that, when combined, give back A:

inx = np.random.choice(np.arange(100), size=70, replace=False)
S1 = A[inx]

So S1 is one of the subsets. How can I construct S2 so that it contains all the elements of A that are not in S1? In other words, S2 = A - S1.

Solution

Set operations may help:

S2 = A[list(set(range(100)) - set(inx))]

A set has no guaranteed order, so sort the indices if S2 should keep the elements in their original order:

S2 = A[sorted(set(range(100)) - set(inx))]
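
A boolean mask is another way to take the complement of the indices without going through Python sets. This is a minimal sketch, not part of the original answer; the name `mask` is only illustrative.

```python
import numpy as np

A = np.random.sample(100)
inx = np.random.choice(np.arange(100), size=70, replace=False)

# Start with every position marked as "not selected" ...
mask = np.ones(A.size, dtype=bool)
# ... then clear the positions that were drawn for S1.
mask[inx] = False

S1 = A[inx]
S2 = A[mask]  # elements at the remaining positions, in their original order
```

Because the mask is indexed by position, S2 keeps the original order of A and no sorting step is needed.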

OTHER TIPS

(Minor: if A can have duplicate elements, choosing the complement of the indices and having S2 contain all the elements in A not in S1 aren't the same thing.)
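
A small made-up array shows the difference. The values below are chosen only for illustration, and `np.delete` is used here as one possible way to drop the selected positions.

```python
import numpy as np

A = np.array([1, 1, 2, 3])   # note the duplicated 1
inx = np.array([0, 2])       # positions selected for S1
S1 = A[inx]                  # values 1 and 2

# Complement by position: drop positions 0 and 2, keep positions 1 and 3.
by_index = np.delete(A, inx)

# Complement by value: every value that also occurs in S1 is removed.
by_value = np.setdiff1d(A, S1)

print(by_index)  # [1 3]
print(by_value)  # [3]
```

The second 1 survives the index-based complement but not the value-based one, so only the index-based version recombines with S1 to give back all of A.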

I might bypass indices entirely, instead permuting the elements and then splitting the results:

>>> A = np.random.sample(10)
>>> S1, S2 = np.split(np.random.permutation(A), [7])
>>> S1
array([ 0.97128145,  0.5617039 ,  0.42625808,  0.39108218,  0.52366291,
        0.73606525,  0.5279909 ])
>>> S2
array([ 0.45652426,  0.38622805,  0.99084781])

There's also np.setdiff1d, which returns the sorted, unique values of A that are not in S1, so if you already have S1:

>>> S2 = np.setdiff1d(A, S1)
>>> S2
array([ 0.38622805,  0.45652426,  0.99084781])
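
If the selected indices are still available, `np.setdiff1d` can be applied to the index arrays instead of the values, which avoids relying on the values of A being distinct. A rough sketch, with `rest` as an illustrative name:

```python
import numpy as np

A = np.random.sample(10)
inx = np.random.choice(np.arange(A.size), size=7, replace=False)

# Difference of the index sets, not of the values; the result is sorted.
rest = np.setdiff1d(np.arange(A.size), inx)

S1 = A[inx]
S2 = A[rest]
```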

I think this code is equivalent to what you are trying to do.

A = np.random.sample(100)
T = A.copy()  # A[:] would only be a view, so shuffling it would also shuffle A
np.random.shuffle(T)

size = 70
S1 = T[:size]
S2 = T[size:]
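
The same split can be written with NumPy's newer `Generator` interface by permuting the indices rather than the data, which leaves A untouched. This is a sketch under the assumption that a recent NumPy with `np.random.default_rng` is available; the seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, only for reproducibility
A = rng.random(100)

# A random ordering of the positions 0..99; A itself is not modified.
perm = rng.permutation(A.size)

size = 70
S1 = A[perm[:size]]
S2 = A[perm[size:]]
```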

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow