PyOpenCL reduction kernel on each pixel of an image array instead of each byte (RGB mode, 24 bits)

https://stackoverflow.com/questions/23453519

15-07-2023

Question

I'm trying to calculate the average luminance of an RGB image. To do this, I compute the luminance of each pixel as a linear combination of its channels:

L(r,g,b) = X*r + Y*g + Z*b

I then sum the luminance of all pixels and divide by width*height to get the average. To speed this up, I'm using pyopencl.reduction.ReductionKernel.

The array I pass to it is a one-dimensional NumPy array, so it has the same form as the input in the ReductionKernel example.

from PIL import Image  # Pillow; the bare "import Image" only works with the old PIL
import numpy as np
im = Image.open('image_00000001.bmp')
data = np.asarray(im).reshape(-1) # flatten to a one-dimensional array
# data.dtype is uint8, data.shape is (w*h*3, )
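
As a CPU baseline for checking the kernel's result, the computation described above can be sketched in plain NumPy. This is only an illustration: the weights are an assumption (Rec. 601 luma coefficients standing in for X, Y and Z), average_luminance is a hypothetical helper name, and a small synthetic array replaces the BMP file.

```python
import numpy as np

# Assumed Rec. 601 luma weights standing in for X, Y, Z; substitute your own.
WEIGHTS = np.array([0.299, 0.587, 0.114], dtype=np.float32)


def average_luminance(flat_rgb, weights=WEIGHTS):
    """Average luminance of a flat uint8 array laid out as r, g, b, r, g, b, ..."""
    pixels = flat_rgb.reshape(-1, 3).astype(np.float32)  # one row per pixel
    luminance = pixels @ weights                          # L = X*r + Y*g + Z*b
    return float(luminance.mean())                        # sum / (width*height)


# Synthetic 2x2 image in place of the BMP: two white and two black pixels.
demo = np.array([255, 255, 255, 0, 0, 0,
                 255, 255, 255, 0, 0, 0], dtype=np.uint8)
avg = average_luminance(demo)
print(avg)
```

The reshape to (-1, 3) is what groups three consecutive bytes into one pixel on the host; the open question is how to get the same grouping inside the reduction kernel.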

I want to adapt the following example to my case, changing the data type and the arrays I pass in. This is the example:

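import numpy
import pyopencl
import pyopencl.array
from pyopencl.reduction import ReductionKernel

ctx = pyopencl.create_some_context()
queue = pyopencl.CommandQueue(ctx)

a = pyopencl.array.arange(queue, 400, dtype=numpy.float32)
b = pyopencl.array.arange(queue, 400, dtype=numpy.float32)

krnl = ReductionKernel(ctx, numpy.float32, neutral="0",
        reduce_expr="a+b", map_expr="x[i]*y[i]",
        arguments="__global float *x, __global float *y")

my_dot_prod = krnl(a, b).get()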

Except that my map_expr would operate on each pixel and convert it to its luminance value; reduce_expr stays the same.

The problem is that the kernel operates on each element of the array, whereas I need it to operate on each pixel, i.e. three consecutive elements (R, G, B) at a time.

One solution is to use three separate arrays, one each for R, G and B. That would work, but is there another way?

Solution

Edit: I changed the program to illustrate char4 usage instead of float4. The idea is to use an OpenCL vector type so that each array element holds one whole pixel:

import numpy as np
import pyopencl as cl
import pyopencl.array as cl_array


deviceID = 0
platformID = 0
workGroup = (1, 1)

N = 10
# Each element is a char4: four signed 8-bit components (x, y, z, w).
# Recent PyOpenCL releases expose the same type as pyopencl.cltypes.char4.
testData = np.zeros(N, dtype=cl_array.vec.char4)

dev = cl.get_platforms()[platformID].get_devices()[deviceID]

ctx = cl.Context([dev])
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags
# Device buffer with room for N char4 elements
Data_In = cl.Buffer(ctx, mf.READ_WRITE, testData.nbytes)


prg = cl.Program(ctx, """

__kernel void   Pack_Cmplx( __global char4* Data_In, int  N)
{
  int gid = get_global_id(0);  // one work-item per char4 element

  //Data_In[gid] = 1; // This would change all components to one
  Data_In[gid].x = 1;  // changing single component
  Data_In[gid].y = 2;
  Data_In[gid].z = 3;
  Data_In[gid].w = 4;
}
 """).build()

# Launch N work-items, then copy the result back into the host array
prg.Pack_Cmplx(queue, (N,1), workGroup, Data_In, np.int32(N))
cl.enqueue_copy(queue, testData, Data_In)
print(testData)
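
The kernel above addresses one 4-component element per work-item, while an RGB image stores only 3 bytes per pixel, so the host data would first need padding to 4 bytes per pixel. A minimal NumPy-only sketch of that packing step follows; pack_rgb_as_rgba is a hypothetical helper name, and the input is a synthetic array rather than a real image. Since pixel values run from 0 to 255, an unsigned type such as uchar4 is probably a better fit on the device than the signed char4.

```python
import numpy as np


def pack_rgb_as_rgba(flat_rgb):
    """Pad flat r, g, b, ... uint8 data to one 4-byte (r, g, b, 0) record per pixel."""
    rgb = flat_rgb.reshape(-1, 3)                        # one row per pixel
    rgba = np.zeros((rgb.shape[0], 4), dtype=np.uint8)   # 4th byte is padding
    rgba[:, :3] = rgb
    return rgba


# Synthetic data for four pixels: bytes 0..11 laid out as r, g, b, r, g, b, ...
demo = np.arange(12, dtype=np.uint8)
packed = pack_rgb_as_rgba(demo)
print(packed.shape, packed.nbytes)
```

A C-contiguous (n, 4) uint8 array has 4 bytes per row, which matches the size of n uchar4 values, so each row corresponds to one pixel element on the device. The padding byte costs a third more memory but avoids splitting the image into separate R, G and B arrays.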

I hope it helps.

Licensed under: CC-BY-SA with attribution