Ok, for using a number of images you must add an extra dimension to shared variable in order to hold the number of images.
Frage
I have a cpp file where I am creating an image and store the data to myOutput pointer:
int Rows = 80;
int Cols = 64;
for (int i = 0; i < Rows; i++ ){
for (int j = 0; j < Cols; j++ )
{
X = 1.0f * ((float) i - (float) Rows / 2) / (float) Rows;
Y = 2.0f * ((float) j - (float) Cols / 2) / (float) Cols;
.....
myOutput->Re = cosf( ......);
myOutput->Im = sinf(.......);
++myOutput;
}
}
Then , in cuda I am reading like:
int bx = blockIdx.x , by = blockIdx.y;
int tx = threadIdx.x , ty = threadIdx.y;
int RowIdx = ty + by * TILE_WIDTH;
int ColIdx = tx + bx * TILE_WIDTH;
Index = RowIdx * Cols + ColIdx;
//copy input data to shared memory
myshared[ty+1][tx+1] = *( devInputArray + Index );
(So , the myOutput generated from cpp is loaded in devInputArray).
Now , I want to process many images simultaneously.
So, in cpp ,the following additions must be made (for 2 images for example) :
int ImagesNb = 2;
for ( ImagesIdx = 0; ImagesIdx < ImagesNb; ImagesIdx++ ){
for (int i = 0; i < Rows; i++ ){
for (int j = 0; j < Cols; j++ )
{
X = (ImagesIdx + 1) * 1.0f * ((float) i - (float) Rows / 2) / (float) Rows;
Y = (ImagesIdx + 1) * 2.0f * ((float) j - (float) Cols / 2) / (float) Cols;
...
But , now I am not sure how to read the data from cuda.
I don't know how to take into account the number of images.
Before , I had a pointer which contained data (80 x 64) .
Now , it still contains the same dimension of every image but with more data.
I must change this:
Index = RowIdx * Cols + ColIdx;
//copy input data to shared memory
myshared[ty+1][tx+1] = *( devInputArray + Index );
but I can't figure how!
I hope it is clear!
UPDATED
I am trying something like this:
int bx = blockIdx.x , by = blockIdx.y , bz = blockIdx.z;
int tx = threadIdx.x , ty = threadIdx.y , tz = threadIdx.z;
int RowIdx = ty + by * TILE_WIDTH;
int ColIdx = tx + bx * TILE_WIDTH;
int ImagesIdx = tz + bz * blockDim.z;
Index = RowIdx * Cols + ColIdx + Rows * Cols * ImagesIdx
and :
dim3 dimGrid( ImagesNb * (Cols / TILE_WIDTH) , ImagesNb * (Rows / TILE_WIDTH) , ImagesNb);
dim3 dimBlock( TILE_WIDTH , TILE_WIDTH , 2);
but if I try for 2 images I am not getting right results..
Lösung