FFTW advanced layout -- inembed=n and inembed=NULL give different results?

https://stackoverflow.com/questions/16905944

30-05-2022

Question

I'm working with batched 2D FFTs using the FFTW advanced data layout API.

According to the FFTW Advanced Complex DFT documentation:

"Passing NULL for an nembed parameter is equivalent to passing n."

However, I'm getting different results when using inembed = onembed = NULL vs. inembed = onembed = n. What could be causing the results not to match?

Let's do an example...

Setup

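int howMany = 2;
int nRows = 4;
int nCols = 4;
int n[2] = {nRows, nCols};
float* h_in = (float*)malloc(sizeof(float) * nRows*nCols*howMany);
// h_freq is not declared in the original post; this allocation is an assumption,
// sized to the 32 complex results printed below
fftwf_complex* h_freq = (fftwf_complex*)calloc(nRows*nCols*howMany, sizeof(fftwf_complex));
for(int i=0; i<(nRows*nCols*howMany); i++){ //initialize h_in to [0 1 2 3 4 ...]
    h_in[i] = (float)i;
    printf("h_in[%d] = %f \n", i, h_in[i]);
}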

FFTW Plan using inembed == onembed == NULL

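// Note: planning with FFTW_PATIENT overwrites the in/out arrays, so h_in should be (re)initialized after this call
fftwf_plan forwardPlan = fftwf_plan_many_dft_r2c(2, //rank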
                            n, //dimensions = {nRows, nCols}
                            howMany, //howmany
                            h_in, //in
                            NULL, //inembed
                            howMany, //istride
                            1, //idist
                            h_freq, //out
                            NULL, //onembed
                            howMany, //ostride
                            1, //odist
                            FFTW_PATIENT /*flags*/);

I also ran a version of this with inembed = onembed = n = {nRows, nCols}.

Results

Notice that using NULL or n gives the same numerical results, but in a different order in memory:

Version 1: inembed == onembed == NULL

result[0][0,1] = 240, 0 
result[1][0,1] = 256, 0 
result[2][0,1] = -16, 16 
result[3][0,1] = -16, 16 
result[4][0,1] = -16, 0 
result[5][0,1] = -16, 0  //this line and above match the other version
result[6][0,1] = -64, 64  //this line and below don't match (data is in a different order)
result[7][0,1] = -64, 64  
result[8][0,1] = 0, 0 
result[9][0,1] = 0, 0 
result[10][0,1] = 0, 0 
result[11][0,1] = 0, 0 
result[12][0,1] = -64, 0 
result[13][0,1] = -64, 0 
result[14][0,1] = 0, 0 
result[15][0,1] = 0, 0 
result[16][0,1] = 0, 0 
result[17][0,1] = 0, 0 
result[18][0,1] = -64, -64 
result[19][0,1] = -64, -64 
result[20][0,1] = 0, 0 
result[21][0,1] = 0, 0 
result[22][0,1] = 0, 0 
result[23][0,1] = 0, 0 
result[24][0,1] = 0, 0 
result[25][0,1] = 0, 0 
result[26][0,1] = 0, 0 
result[27][0,1] = 0, 0 
result[28][0,1] = 0, 0 
result[29][0,1] = 0, 0 
result[30][0,1] = 0, 0 
result[31][0,1] = 0, 0

Version 2: inembed = onembed = n = {nRows, nCols}

result[0][0,1] = 240, 0 
result[1][0,1] = 256, 0 
result[2][0,1] = -16, 16 
result[3][0,1] = -16, 16 
result[4][0,1] = -16, 0 
result[5][0,1] = -16, 0 
result[6][0,1] = 0, 0  
result[7][0,1] = 0, 0  
result[8][0,1] = -64, 64 
result[9][0,1] = -64, 64 
result[10][0,1] = 0, 0 
result[11][0,1] = 0, 0 
result[12][0,1] = 0, 0 
result[13][0,1] = 0, 0 
result[14][0,1] = 0, 0 
result[15][0,1] = 0, 0 
result[16][0,1] = -64, 0 
result[17][0,1] = -64, 0 
result[18][0,1] = 0, 0 
result[19][0,1] = 0, 0 
result[20][0,1] = 0, 0 
result[21][0,1] = 0, 0 
result[22][0,1] = 0, 0 
result[23][0,1] = 0, 0 
result[24][0,1] = -64, -64 
result[25][0,1] = -64, -64 
result[26][0,1] = 0, 0 
result[27][0,1] = 0, 0 
result[28][0,1] = 0, 0 
result[29][0,1] = 0, 0 
result[30][0,1] = 0, 0 
result[31][0,1] = 0, 0

Here's a working implementation of this experiment.

Solution

Solution:
For the out-of-place transform in the example above, the non-NULL version reproduces the NULL results when inembed = {nRows, nCols} and onembed = {nRows, (nCols/2 + 1)}.

Details:

I resolved this after very carefully reading the FFTW documentation and getting some help from Matteo Frigo. You can retrace my steps here:

According to 4.4.2 Advanced Real-data DFTs in the FFTW manual: "If an nembed parameter is NULL, it is interpreted as what it would be in the basic interface."

Assume the real input has ny rows and nx columns, with nx the contiguous (fastest-varying) dimension. For the FFTW basic interface, 2.4 Multi-Dimensional DFTs of Real Data describes array layouts that correspond to the following inembed and onembed values for 2D real-to-complex FFTs:

if out-of-place:
    inembed = [ny, nx]
    onembed = [ny, (nx/2 + 1)]

if in-place:
    inembed = [ny, 2(nx/2 + 1)]
    onembed = [ny, (nx/2 + 1)]
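
The conventions above can be sketched as a small helper. This is only an illustration: r2c_default_embed is a hypothetical name, not an FFTW function, and it covers only the rank-2 real-to-complex case discussed here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of FFTW): fills in the inembed/onembed
 * values that a NULL argument stands for in a rank-2 real-to-complex plan,
 * following the conventions listed above.
 * n[0] is the number of rows (ny), n[1] the number of columns (nx). */
void r2c_default_embed(const int n[2], int in_place,
                       int inembed[2], int onembed[2])
{
    assert(n != NULL && n[0] > 0 && n[1] > 0);

    /* The complex output keeps only nx/2 + 1 columns per row. */
    int n_complex_cols = n[1] / 2 + 1;

    inembed[0] = n[0];
    /* In-place: each real row is padded so the complex row fits in it. */
    inembed[1] = in_place ? 2 * n_complex_cols : n[1];

    onembed[0] = n[0];
    onembed[1] = n_complex_cols;
}
```

For the 4 x 4 example in the question, the out-of-place case gives inembed = {4, 4} and onembed = {4, 3}, and the in-place case gives inembed = {4, 6} and onembed = {4, 3}.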

So, when we use the simple FFTW r2c interface or use the advanced interface with embed=NULL, FFTW defaults to the above embed parameters. We can reproduce the numerical results from embed=NULL by using the above embed parameters.

It turns out that the statement "Passing NULL for an nembed parameter is equivalent to passing n" comes from the FFTW complex-to-complex manual page. The examples above, however, are real-to-complex transforms, and those follow a different convention for inembed and onembed than complex-to-complex transforms do.
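
To see why the two listings in the question contain the same values in a different order, it helps to compute where each output element lands. The sketch below follows my reading of the advanced-layout description in the FFTW manual: element (row, col) of transform number batch sits at index (row * nembed[1] + col) * stride + batch * dist. The name embed_index is hypothetical, not an FFTW function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of FFTW): index of element (row, col) of
 * transform `batch` in a rank-2 advanced-layout array, computed as
 * (row * nembed[1] + col) * stride + batch * dist. */
size_t embed_index(const int nembed[2], int stride, int dist,
                   int batch, int row, int col)
{
    assert(nembed != NULL);
    assert(row >= 0 && row < nembed[0]);
    assert(col >= 0 && col < nembed[1]);
    assert(batch >= 0 && stride > 0 && dist > 0);

    return ((size_t)row * (size_t)nembed[1] + (size_t)col) * (size_t)stride
           + (size_t)batch * (size_t)dist;
}
```

With ostride = 2 and odist = 1 as in the question, element (1, 0) of the first transform is at index 6 for onembed = {4, 3} and at index 8 for onembed = {4, 4}. That is where the value -64, 64 appears in Version 1 and Version 2 respectively.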

Licensed under: CC-BY-SA with attribution