Question

I have a .cpp file in which I create an image and store its data through the pointer myOutput:

int Rows = 80;
int Cols = 64;

for (int i = 0; i < Rows; i++) {
    for (int j = 0; j < Cols; j++) {
        X = 1.0f * ((float) i - (float) Rows / 2) / (float) Rows;
        Y = 2.0f * ((float) j - (float) Cols / 2) / (float) Cols;
        .....
        myOutput->Re = cosf( ......);
        myOutput->Im = sinf(.......);

        ++myOutput;
    }
}

Then, in CUDA, I read it like this:

int bx = blockIdx.x , by = blockIdx.y;
int tx = threadIdx.x , ty = threadIdx.y;

int RowIdx = ty + by * TILE_WIDTH;
int ColIdx = tx + bx * TILE_WIDTH;


Index = RowIdx * Cols + ColIdx;

//copy input data to shared memory
myshared[ty+1][tx+1] = *( devInputArray + Index );

(So the myOutput data generated in the .cpp file is loaded into devInputArray.)

Now I want to process many images simultaneously.

So, in the .cpp file, the following additions must be made (for 2 images, for example):

int ImagesNb = 2;

for (ImagesIdx = 0; ImagesIdx < ImagesNb; ImagesIdx++) {
    for (int i = 0; i < Rows; i++) {
        for (int j = 0; j < Cols; j++) {
            X = (ImagesIdx + 1) * 1.0f * ((float) i - (float) Rows / 2) / (float) Rows;
            Y = (ImagesIdx + 1) * 2.0f * ((float) j - (float) Cols / 2) / (float) Cols;
            ...

But now I am not sure how to read the data in CUDA.

I don't know how to take the number of images into account.

Before, I had a pointer to the data of a single image (80 x 64).

Now every image still has the same dimensions, but the pointer holds the data of several images.

I must change this:

Index = RowIdx * Cols + ColIdx;

//copy input data to shared memory
myshared[ty+1][tx+1] = *( devInputArray + Index );

but I can't figure out how!

I hope this is clear!

UPDATED

I am trying something like this:

 int bx = blockIdx.x , by = blockIdx.y ,  bz = blockIdx.z;
 int tx = threadIdx.x , ty = threadIdx.y , tz = threadIdx.z;

 int RowIdx = ty + by * TILE_WIDTH;
 int ColIdx = tx + bx * TILE_WIDTH;
 int ImagesIdx = tz + bz * blockDim.z;

 Index = RowIdx * Cols + ColIdx + Rows * Cols * ImagesIdx;

and:

dim3 dimGrid( ImagesNb * (Cols / TILE_WIDTH)  , ImagesNb * (Rows / TILE_WIDTH) , ImagesNb);
dim3 dimBlock( TILE_WIDTH , TILE_WIDTH , 2);

but when I try this with 2 images, I do not get the right results.


Solution

OK, to process several images at once you must add an extra dimension to the shared-memory array, so that each image handled by a block gets its own tile instead of all images writing into the same one.
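
As a rough illustration of what that means for the indexing, here is a CPU-only C++ sketch (it is not the asker's kernel). It assumes the images are stored back to back in row-major order, as the Index line in the question's update implies, and that TILE_WIDTH is 16, which the question does not state. The names flatIndex, emulateLaunch and LaunchStats are hypothetical. The sketch walks a grid of (Cols / TILE_WIDTH, Rows / TILE_WIDTH, ImagesNb / blockZ) blocks with (TILE_WIDTH, TILE_WIDTH, blockZ) threads each, and checks two things: every input element is read exactly once, and, when the shared tile is indexed as [tz][ty+1][tx+1], no two threads of a block write the same shared slot.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sizes from the question; TILE_WIDTH = 16 is an assumption.
constexpr int Rows = 80;
constexpr int Cols = 64;
constexpr int TILE_WIDTH = 16;

// Flat offset of pixel (RowIdx, ColIdx) of image ImagesIdx when the images
// are stored one after another, each in row-major order.
inline std::size_t flatIndex(int ImagesIdx, int RowIdx, int ColIdx) {
    return static_cast<std::size_t>(ImagesIdx) * Rows * Cols
         + static_cast<std::size_t>(RowIdx) * Cols
         + static_cast<std::size_t>(ColIdx);
}

struct LaunchStats {
    int minReads;       // fewest reads of any input element
    int maxReads;       // most reads of any input element
    int maxSlotWrites;  // most writes to one shared-tile slot within a block
};

// CPU emulation of the thread/block loops (hypothetical helper).
// blockZ plays the role of blockDim.z: images handled per block.
LaunchStats emulateLaunch(int ImagesNb, int blockZ) {
    assert(blockZ > 0 && ImagesNb % blockZ == 0);
    const int S = TILE_WIDTH + 2;  // tile side including the +1 border offset
    std::vector<int> reads(static_cast<std::size_t>(ImagesNb) * Rows * Cols, 0);
    int maxSlotWrites = 0;

    for (int bz = 0; bz < ImagesNb / blockZ; ++bz)
    for (int by = 0; by < Rows / TILE_WIDTH; ++by)
    for (int bx = 0; bx < Cols / TILE_WIDTH; ++bx) {
        // Stand-in for myshared[blockZ][S][S]: one tile per image in the block.
        std::vector<int> slots(static_cast<std::size_t>(blockZ) * S * S, 0);
        for (int tz = 0; tz < blockZ; ++tz)
        for (int ty = 0; ty < TILE_WIDTH; ++ty)
        for (int tx = 0; tx < TILE_WIDTH; ++tx) {
            int RowIdx = ty + by * TILE_WIDTH;
            int ColIdx = tx + bx * TILE_WIDTH;
            int ImagesIdx = tz + bz * blockZ;
            ++reads[flatIndex(ImagesIdx, RowIdx, ColIdx)];
            // Equivalent of myshared[tz][ty+1][tx+1]
            int slot = (tz * S + (ty + 1)) * S + (tx + 1);
            maxSlotWrites = std::max(maxSlotWrites, ++slots[slot]);
        }
    }
    auto mm = std::minmax_element(reads.begin(), reads.end());
    return {*mm.first, *mm.second, maxSlotWrites};
}
```

With this layout, emulateLaunch(2, 2) reports exactly one read per input element and one write per shared slot. Note that the grid's x and y sizes are not multiplied by ImagesNb here: multiplying them, as in the question's dimGrid, makes ColIdx and RowIdx run past Cols and Rows. Likewise, a grid z size of ImagesNb combined with a block z size of 2 makes ImagesIdx run past the last image.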

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow