How to bin a matrix

Question 1

If you're applying this to an array with many rows, this function will give you some speed-up at the cost of some temporary memory.

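import numpy as np

def hist_per_row(data, bins):

    data = np.asarray(data)
    bins = np.asarray(bins)  # so that bins.searchsorted also works for lists

    # bins must be sorted in non-decreasing order
    assert np.all(bins[:-1] <= bins[1:])
    r, c = data.shape
    # bin index per value: 0 means at or below bins[0], len(bins) means above bins[-1]
    idx = bins.searchsorted(data)
    # each row gets its own block of len(bins) + 1 counters
    step = len(bins) + 1
    last = step * r
    idx += np.arange(0, last, step).reshape((r, 1))
    # count all rows in one pass, then split the counts back into rows
    res = np.bincount(idx.ravel(), minlength=last)
    res = res.reshape((r, step))
    return res[:, 1:-1]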

The res[:, 1:-1] on the last line is there to be consistent with numpy.histogram, which returns an array of length len(bins) - 1. You could drop it if you also want to count the values that fall below bins[0] and above bins[-1].
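As a quick sanity check (a minimal sketch, not a benchmark), the result can be compared against calling numpy.histogram once per row. The sample data and bin edges below are made up for illustration, and the function is repeated so the snippet runs on its own. Keep in mind that searchsorted with its default side='left' puts a value that is exactly equal to an edge into the lower bin, while numpy.histogram puts it into the upper one, so the two can differ for values lying exactly on an edge. The sample data avoids that case.

```python
import numpy as np

def hist_per_row(data, bins):
    data = np.asarray(data)
    bins = np.asarray(bins)
    assert np.all(bins[:-1] <= bins[1:])
    r, c = data.shape
    idx = bins.searchsorted(data)
    step = len(bins) + 1
    last = step * r
    idx += np.arange(0, last, step).reshape((r, 1))
    res = np.bincount(idx.ravel(), minlength=last)
    return res.reshape((r, step))[:, 1:-1]

# made-up sample: 2 rows, 2 bins, no value sits exactly on an edge
data = np.array([[0.1, 0.3, 0.3, 0.9],
                 [0.6, 0.6, 0.7, 0.2]])
bins = np.array([0.0, 0.5, 1.0])

fast = hist_per_row(data, bins)
# reference: one numpy.histogram call per row
ref = np.array([np.histogram(row, bins=bins)[0] for row in data])

print(fast)
print(np.array_equal(fast, ref))
```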

Question 2

Thank you everybody for your answers and comments. I finally found a way to speed up the binning procedure. Instead of using bins.searchsorted(data), I compute the indices with np.array(data*nbins, dtype=int). Substituting this line into the code posted by Bi Rico makes it about a factor of 3 faster. Note that this shortcut assumes equally spaced bins that start at 0. Below I post Bi Rico's function with my modification, so that other users can easily use it.

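import numpy as np

def hist_per_row(data, bins):

    data = np.asarray(data)
    bins = np.asarray(bins)
    assert np.all(bins[:-1] <= bins[1:])
    r, c = data.shape

    # assumes equally spaced bins starting at 0 and values in [0, bins[-1]],
    # so the bin index can be computed arithmetically instead of searched
    nbins = len(bins)-1
    data = data/bins[-1]
    idx = np.array(data*nbins, dtype=int)+1

    step = len(bins) + 1
    last = step * r
    idx += np.arange(0, last, step).reshape((r, 1))
    res = np.bincount(idx.ravel(), minlength=last)
    res = res.reshape((r, step))
    return res[:, 1:-1]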
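To illustrate when the arithmetic shortcut is safe, here is a small sketch that compares the scaled index with the one from searchsorted. The helper name arithmetic_index and the sample values are made up for this example. With equally spaced edges starting at 0 the two agree, but with unevenly spaced edges they do not, so the shortcut should only be used in the first case.

```python
import numpy as np

def arithmetic_index(data, bins):
    # hypothetical helper: bin index obtained by scaling, as in the function above
    nbins = len(bins) - 1
    return np.array(data / bins[-1] * nbins, dtype=int) + 1

data = np.array([0.3, 0.7, 1.2, 1.9])

# equally spaced edges starting at 0: both methods give the same indices
uniform = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
same = np.array_equal(arithmetic_index(data, uniform),
                      uniform.searchsorted(data))

# unevenly spaced edges: the scaled index no longer matches the real bins
uneven = np.array([0.0, 0.1, 1.0, 2.0])
differs = not np.array_equal(arithmetic_index(data, uneven),
                             uneven.searchsorted(data))

print(same, differs)
```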

Question 3

Something along these lines?

import numpy as np
data = np.random.rand(10,20)
print(np.apply_along_axis(lambda x: np.histogram(x)[0], 1, data))
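One thing to keep in mind with this approach: called without a bins argument, np.histogram uses 10 equal-width bins spanning each row's own minimum and maximum, so the columns of the result do not refer to the same intervals from row to row. A minimal sketch of passing one shared set of edges instead (the edges, the seed and the random data are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)      # seed chosen arbitrarily
data = rng.random((10, 20))         # values in [0, 1)
bins = np.linspace(0.0, 1.0, 6)     # 5 bins shared by all rows (assumed edges)

# same idea as above, but every row is binned with the same edges
counts = np.apply_along_axis(lambda x: np.histogram(x, bins=bins)[0], 1, data)
print(counts.shape)
```

This still calls np.histogram once per row, so it is simple rather than fast.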