I've come up with a solution that is adequate for my application using NumPy masked arrays. In my application, the arr list is not "too ragged" (i.e. the max length of any interior list is not extremely different from the min length of any interior list). Therefore, I start by padding arr with -1s, then create a mask based on the location of the -1s. Padding with -1 is safe here because it is still a valid index into x (the last element) and never appears as a real index in arr. I perform my operation and use the mask on the resulting array. A few extra calculations are done unnecessarily (on the padded entries), but this is still faster than the Python loop (by a factor of almost 2). The example code is below:
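import numpy as np
import numpy.ma as ma

x = np.array([1, 2, 3, 4, 5], dtype=np.double)
arr = [[1, 2], [0, 4, 3], [1, 4, 0], [0, 3, 4], [1, 4]]

# Pad every interior list with -1 up to the length of the longest one
max_arr_length = max(len(item) for item in arr)
arr_padded = [np.pad(i, (0, max_arr_length - len(i)), mode='constant',
                     constant_values=-1) for i in arr]

# Mask the padded positions; -1 still indexes x without raising
arr_masked = ma.masked_equal(arr_padded, -1)

# Do the operation on the full rectangular array, then reapply the mask
ans_masked = ma.masked_array(x[arr_masked] - x[:, None], mask=arr_masked.mask)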
This is a bit of a hack, but it works well enough for me. It would be nice if NumPy had native support for ragged arrays.
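To sanity-check the padded/masked approach, here is a minimal sketch that compares it against a plain Python loop and converts the masked result back into ragged rows. The function names ragged_diff_masked and ragged_diff_loop are my own for illustration, and I am assuming the operation is x[arr[i]] - x[i] for each row i, as in the code above.

```python
import numpy as np
import numpy.ma as ma


def ragged_diff_masked(x, arr, pad=-1):
    """Padded/masked computation of x[arr[i]] - x[i] for every row i.

    Assumes `pad` never occurs as a real index in `arr`.
    """
    width = max(len(row) for row in arr)
    padded = np.full((len(arr), width), pad, dtype=np.intp)
    for i, row in enumerate(arr):
        padded[i, :len(row)] = row
    mask = padded == pad
    # pad == -1 is still a valid index (last element), so x[padded] works;
    # the values computed at padded positions are hidden by the mask.
    return ma.masked_array(x[padded] - x[:, None], mask=mask)


def ragged_diff_loop(x, arr):
    """Reference implementation: one small array per row."""
    return [x[np.asarray(row, dtype=np.intp)] - x[i]
            for i, row in enumerate(arr)]


x = np.array([1, 2, 3, 4, 5], dtype=np.double)
arr = [[1, 2], [0, 4, 3], [1, 4, 0], [0, 3, 4], [1, 4]]

masked = ragged_diff_masked(x, arr)
looped = ragged_diff_loop(x, arr)

# compressed() drops the masked (padded) entries of a row
ragged = [masked[i].compressed() for i in range(len(masked))]
matches = all(np.array_equal(a, b) for a, b in zip(ragged, looped))
print(matches)
```

Whether the masked version is actually faster depends on how ragged the lists are, so it is worth timing both functions on your own data.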