Speed up a delta filter in python/numpy
Question
I am writing a decompressor which (among other things) has to apply a delta filter to RGB images. That is, read images where only the first pixel is absolute (R1, G1, B1) and all the others are in the form (R[n]-R[n-1], G[n]-G[n-1], B[n]-B[n-1]), and convert them to standard RGB.
Right now I am using numpy as follows:
rgb = numpy.fromstring(data, 'uint8')
components = rgb.reshape(3, -1, order='F')
filtered = numpy.cumsum(components, dtype='uint8', axis=1)
frame = numpy.reshape(filtered, -1, order='F')
Where:
- line 1 creates a 1D array from the raw image data;
- line 2 reshapes it into the form
[[R1, R2, ..., Rn], [G1, G2, ..., Gn], [B1, B2, ..., Bn]]
- line 3 performs the actual defiltering;
- line 4 converts it back to a 1D array.
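Pieced together on a tiny made-up frame (three pixels, byte values invented purely for illustration), the pipeline behaves like this:

```python
import numpy as np

# Hypothetical 3-pixel frame: first pixel absolute, the rest stored as
# per-channel deltas mod 256.
# Original pixels: (10, 20, 30), (12, 19, 33), (11, 21, 30)
data = bytes([10, 20, 30, 2, 255, 3, 255, 2, 253])

rgb = np.frombuffer(data, 'uint8')          # 1D view of the raw bytes
components = rgb.reshape(3, -1, order='F')  # rows: [R...], [G...], [B...]
filtered = np.cumsum(components, dtype='uint8', axis=1)  # undo the delta filter
frame = np.reshape(filtered, -1, order='F') # back to interleaved R, G, B, ...

print(frame.tolist())  # [10, 20, 30, 12, 19, 33, 11, 21, 30]
```

Because the cumulative sum is carried out in uint8, overflow wraps mod 256, which is exactly what reverses deltas like 19 - 20 = -1 stored as 255.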
The problem is that it is too slow for my needs. I profiled it and found out that a good amount of time is spent reshaping the array.
So I wonder: is there a way to avoid the reshapes, or to speed them up?
Notes:
- I'd prefer not to have to write a C extension for this.
- I'm already using multithreading
Solution
First, when you read the data in, you can tell NumPy a little more about the type. Try:
rgb = numpy.fromstring(data, '3uint8')
The '3uint8' subarray dtype yields an array of shape (n_pixels, 3) directly, so no reshape is needed.
Next, for large operations where you can get away with it (and cumsum qualifies), use the out= parameter to avoid moving data around: everything happens in place. Use:
rgb.cumsum(axis=0, out=rgb)
If you still want it flattened:
rgb = rgb.ravel()
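Putting this answer's pieces together as a runnable sketch: numpy.fromstring is deprecated, so this uses numpy.frombuffer plus a .copy() (frombuffer returns a read-only view, and the in-place cumsum needs a writable array). The byte values are invented for illustration.

```python
import numpy as np

# Hypothetical 2-pixel frame: pixels (10, 20, 30) then (12, 19, 33),
# the second stored as deltas mod 256.
data = bytes([10, 20, 30, 2, 255, 3])

# '3uint8' is a subarray dtype, so the result already has shape (n_pixels, 3).
rgb = np.frombuffer(data, '3uint8').copy()  # copy: frombuffer views are read-only

# In-place cumulative sum down the pixel axis; uint8 wraps mod 256 as needed.
rgb.cumsum(axis=0, out=rgb)

frame = rgb.ravel()  # flattened view of the same buffer, no copy
print(frame.tolist())  # [10, 20, 30, 12, 19, 33]
```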
OTHER TIPS
The final reshape in your code copies the data: the cumsum result is C-contiguous, and flattening a C-contiguous array in Fortran order cannot be expressed as a view, so NumPy has to copy. This can be avoided by using C order throughout:
rgb = numpy.fromstring(data, 'uint8')
components = rgb.reshape(-1, 3)
filtered = numpy.cumsum(components, dtype='uint8', axis=0)
frame = filtered.reshape(-1)
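One way to confirm that the C-order variant flattens without copying (again with made-up bytes, and numpy.frombuffer in place of the deprecated fromstring) is numpy.shares_memory:

```python
import numpy as np

# Hypothetical 2-pixel frame, second pixel stored as deltas mod 256.
data = bytes([10, 20, 30, 2, 255, 3])

rgb = np.frombuffer(data, 'uint8')
components = rgb.reshape(-1, 3)                          # rows are pixels: [R, G, B]
filtered = np.cumsum(components, dtype='uint8', axis=0)  # cumsum down each channel column
frame = filtered.reshape(-1)                             # C-contiguous, so this is a view

print(np.shares_memory(filtered, frame))  # True: the final reshape did not copy
print(frame.tolist())                     # [10, 20, 30, 12, 19, 33]
```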