Python: Change format of np.array or allow tolerance in in1d function

https://stackoverflow.com/questions/23520952

17-07-2023

Question

I have two numpy arrays (data files loaded with np.loadtxt). They do not have the same length (or number of rows if you will).

I want to create a mask that marks which values of the larger array also appear in the smaller array. np.in1d does this, but it compares for exact equality, and the larger array has more decimal places than the smaller one. The following example illustrates the problem:

import numpy as np

a = np.array([1.011, 2.000, 3.001])
b = np.array([1.01, 3.00])
c = np.in1d(a, b)

c
array([False, False, False], dtype=bool)

What I want is for c to be

c
array([True, False, True], dtype=bool)

So is there a way to either make np.in1d accept a tolerance (tol=0.01) or reduce the precision of array a? I am also open to other solutions, of course.

Solution

You could do it as shown below. The function builds the full matrix of pairwise differences, so if a and b get large it needs a lot of memory (on the order of the product of the sizes of a and b). If that's a problem, you could loop over b in small-enough chunks and combine the results.

import numpy as np

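def in1d_tol(a, b, tol):
    # Pairwise absolute differences via broadcasting; d has shape (len(b), len(a)).
    d = np.abs(a - b[:, np.newaxis])
    # An element of a matches if it lies within tol of any element of b.
    return np.any(d <= tol, axis=0)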

a = np.array([1.011, 2.000, 3.001])
b = np.array([1.01, 3.00])

c = in1d_tol(a, b, 0.01)

print(c)  # [ True False  True]
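The chunking idea mentioned above could look roughly like the sketch below. The name in1d_tol_chunked and the chunk_size parameter are made up for illustration and are not part of NumPy. Each pass builds a difference matrix of at most chunk_size × len(a) entries and ORs the partial result into the mask, so peak memory depends on the chunk size rather than on the full size of b.

```python
import numpy as np

def in1d_tol_chunked(a, b, tol, chunk_size=1024):
    """Tolerance-based membership test that processes b in chunks.

    Hypothetical helper: the name and chunk_size are illustrative assumptions.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    mask = np.zeros(a.shape, dtype=bool)
    for start in range(0, b.size, chunk_size):
        chunk = b[start:start + chunk_size]
        # Difference matrix for this chunk only: shape (len(chunk), len(a)).
        d = np.abs(a - chunk[:, np.newaxis])
        # Accumulate matches found in this chunk.
        mask |= np.any(d <= tol, axis=0)
    return mask

a = np.array([1.011, 2.000, 3.001])
b = np.array([1.01, 3.00])

# chunk_size=1 forces one pass per element of b, just to exercise the loop.
c = in1d_tol_chunked(a, b, 0.01, chunk_size=1)
print(c)
```

The result should match in1d_tol for any chunk size; a larger chunk_size trades memory for fewer Python-level loop iterations.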

Licensed under: CC-BY-SA with attribution