pyOpenCL and 2D FFT

https://stackoverflow.com/questions/22818411

26-06-2023

Question

I am using pyFFT to Fourier-transform a 2D array and then continue with another OpenCL program (here, doubling the data as an example):

gpu_data = cl_array.to_device(queue, tData2D)  
plan.execute(gpu_data.data)  
eData2D = gpu_data.get()  


ctx = cl.Context([cl.get_platforms()[0].get_devices()[0]])
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags
eData2D_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=eData2D)
eData2D_dest_buf = cl.Buffer(ctx, mf.WRITE_ONLY, eData2D.nbytes)
prg = cl.Program(ctx, """
        //#define PYOPENCL_DEFINE_CDOUBLE     // uncomment for double support.
        #include "pyopencl-complex.h"    
        __kernel void sum(const unsigned int ySize,
                              __global cfloat_t *a,
                              __global cfloat_t *b)
        {
          int gid0 = get_global_id(0);
          int gid1 = get_global_id(1);

          b[gid1 + ySize*gid0] = a[gid1 + ySize*gid0]+a[gid1 + ySize*gid0];
        }
        """).build()

prg.sum(queue, eData2D.shape, None, np.int32(Ny), eData2D_buf, eData2D_dest_buf)
cl.enqueue_copy(queue, eData2Dresult, eData2D_dest_buf)

This works perfectly fine. Now, instead of retrieving the data to the host (eData2D = gpu_data.get()) and copying it right back to GPU memory (eData2D_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=eData2D)), I would like to keep using it on the device.

I was expecting something like this:

gpu_data = cl_array.to_device(queue, tData2D)  
plan.execute(gpu_data.data)


ctx = cl.Context([cl.get_platforms()[0].get_devices()[0]])
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags
eData2D_dest_buf = cl.Buffer(ctx, mf.WRITE_ONLY, eData2D.nbytes)
prg = cl.Program(ctx, """
        //#define PYOPENCL_DEFINE_CDOUBLE     // uncomment for double support.
        #include "pyopencl-complex.h"    
        __kernel void sum(const unsigned int ySize,
                              __global cfloat_t *a,
                              __global cfloat_t *b)
        {
          int gid0 = get_global_id(0);
          int gid1 = get_global_id(1);

          b[gid1 + ySize*gid0] = a[gid1 + ySize*gid0]+a[gid1 + ySize*gid0];
        }
        """).build()

prg.sum(queue, eData2D.shape, None, np.int32(Ny), gpu_data.data, eData2D_dest_buf)
cl.enqueue_copy(queue, eData2Dresult, eData2D_dest_buf)

This did not work. Is there a way to do that? Thanks in advance for your help.

Solution

It looks like you are creating a whole new context to run the second program:

ctx = cl.Context(...)

An OpenCL buffer is only valid for the context in which it was created, so you need to make sure you use the same context for both OpenCL programs/kernels if you want to re-use the buffer like this.

You could also re-use the command-queue, rather than creating a new one.
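As a minimal sketch of that advice, the example below creates one context and one queue and uses them for the upload, the kernel launch and the download, so the data never makes a round trip through the host. The names `double_on_device`, `double_it` and `KERNEL_SRC` are made up for illustration. The FFT step is only indicated in a comment, and the kernel uses `cfloat_add` on the assumption that a recent PyOpenCL is installed, where `cfloat_t` is a struct and `+` may not compile. PyOpenCL is imported inside the function only so that the sketch can be loaded on a machine without an OpenCL runtime.

```python
import numpy as np

# Kernel that doubles every element of a 2D complex64 array.
KERNEL_SRC = """
#include <pyopencl-complex.h>
__kernel void double_it(const unsigned int ySize,
                        __global const cfloat_t *a,
                        __global cfloat_t *b)
{
    int gid0 = get_global_id(0);
    int gid1 = get_global_id(1);
    int idx = gid1 + ySize * gid0;
    b[idx] = cfloat_add(a[idx], a[idx]);
}
"""


def double_on_device(host_data):
    """Upload a 2D complex64 array, double it on the device, return the result."""
    # Imported here so the sketch loads even without an OpenCL runtime.
    import pyopencl as cl
    import pyopencl.array as cl_array

    # Create the context and the queue exactly once and reuse them below.
    ctx = cl.Context([cl.get_platforms()[0].get_devices()[0]])
    queue = cl.CommandQueue(ctx)

    gpu_data = cl_array.to_device(queue, host_data)
    # A pyFFT plan built on this same queue would transform the data in place:
    #     plan.execute(gpu_data.data)

    # Allocate the output in the same context; no host copy is involved.
    gpu_result = cl_array.empty_like(gpu_data)

    prg = cl.Program(ctx, KERNEL_SRC).build()
    prg.double_it(queue, host_data.shape, None,
                  np.uint32(host_data.shape[1]),
                  gpu_data.data, gpu_result.data)

    # get() copies the result back to the host once the kernel has finished.
    return gpu_result.get()
```

Calling `double_on_device(tData2D)` with a C-contiguous `complex64` array should return an array of the same shape. The important part is that `gpu_data.data` is passed to a kernel built for the same `ctx` that allocated it.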

Licensed under: CC-BY-SA with attribution
