Question

Here is my problem:

I wrote a piece of software with Python and NumPy that produces two NumPy arrays named X and Y.

These values are related by a function: Y = f(X)

X values belong to the interval [0;1].

numpy.histogram allows for binning X values in predefined equally spaced bins in this interval.

What I would like to do is to sum the Y values corresponding to each bin without doing a "for" loop.

Thank you very much for your answers.

Solution

Assuming that your y-values are at the corresponding positions, i.e., y[i] = f(x[i]), you can use numpy.digitize to find the index of the bin that each x-value belongs to, and then use those indices to sum up the corresponding y-values.

From the numpy example (ignore that the values are not within [0; 1]):

>>> import numpy as np
>>> x = np.array([0.2, 6.4, 3.0, 1.6])
>>> bins = np.array([0.0, 1.0, 2.5, 4.0, 10.0])
>>> inds = np.digitize(x, bins)
>>> inds
array([1, 4, 3, 2])

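Then sum up the values in y, where y is an array of the same length as x. Note that np.unique(inds) skips empty bins, so the result has one entry per non-empty bin: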

>>> aggregate = [y[inds == i].sum() for i in np.unique(inds)]

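If you're struggling with creating the bins yourself, look at numpy.linspace. Note that the call below returns 50 bin edges, which define 49 bins; use num=51 if you want 50 bins.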

numpy.linspace(0, 1, num=50, endpoint=True)
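
Since the question asks to avoid a "for" loop altogether, here is a minimal sketch of two loop-free variants. The sample x and y values and the choice of 4 bins are made up for illustration and are not from the original post. The first variant relies on the weights argument of numpy.histogram, the second combines numpy.digitize with numpy.bincount.

```python
import numpy as np

# Illustrative data (assumed for this sketch): x in [0, 1], y[i] = f(x[i])
x = np.array([0.05, 0.15, 0.35, 0.40, 0.95, 1.0])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# 4 equally spaced bins on [0, 1] need 5 edges
edges = np.linspace(0.0, 1.0, num=5)

# Variant 1: with weights=y, histogram returns the sum of y per bin
# instead of the count of x per bin
sums_hist, _ = np.histogram(x, bins=edges, weights=y)

# Variant 2: digitize gives 1-based bin indices; shift to 0-based and
# clip so that x == 1.0 falls into the last bin, as histogram does
idx = np.clip(np.digitize(x, edges) - 1, 0, len(edges) - 2)
sums_bincount = np.bincount(idx, weights=y, minlength=len(edges) - 1)

print(sums_hist)
print(sums_bincount)
```

Unlike the list comprehension over np.unique(inds), both variants return one entry per bin, with 0 for empty bins.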
Licensed under: CC-BY-SA with attribution