Question

I need to minimize the total number of FLOPs in the following code. Can anyone please take a quick look and tell me where to put my effort? I've tried several performance analyzers, but the results were irrelevant.

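// Map a (row, column) pair to an index into a column-major array.
int twoDToOneD(int i, int j, int nRows)
{
    return j*nRows + i;
}

// Element-wise A + B (add == true) or A - B (add == false) for m-by-n matrices.
// Returns a newly allocated array that the caller must release with delete[].
double* addMatrices(int m, int n, double* A, double* B, bool add)
{
    double* C = new double[m*n];
    double* pA = A;
    double* pB = B;
    double* pC = C;

    int i = m*n;

    while(i--)
    {
        if(add)
        {
            *pC = *pA + *pB;
        } else
        {
            *pC = *pA - *pB;
        }

        pC++;
        pA++;
        pB++;
    }

    return C;
}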

Thanks, Cho

Solution

Right, I didn't read every last line, but it looks like you are simply:

  1. generating random ints
  2. storing them as doubles
  3. adding them
  4. adding and multiplying them

In other words, I don't see any use of the fractional part at all, so an appropriately sized integer type would be superior. If that's true, you can remove every FLOP in the program ;)

If that's not an accurate picture of your real data, you can often still use integers within an appropriate range: scale the values up, store them as integers, and scale the result back down to the proper range where needed.
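The scale-up, compute, scale-down idea can be sketched as simple fixed-point arithmetic. This is only an illustration under assumptions: the names `kScale`, `toFixed`, `fromFixed`, `addFixed`, and `mulFixed` are hypothetical and do not appear in the question, and the scale factor of 1000 (three decimal digits) is an arbitrary choice.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical scale factor: 1000 keeps three decimal digits.
constexpr std::int64_t kScale = 1000;

// Convert a real value to fixed point, rounding to the nearest step.
std::int64_t toFixed(double x)
{
    assert(std::isfinite(x));  // NaN and infinity have no integer form
    return std::llround(x * static_cast<double>(kScale));
}

// Convert a fixed-point value back to a double.
double fromFixed(std::int64_t f)
{
    return static_cast<double>(f) / static_cast<double>(kScale);
}

// Addition needs no correction: both operands share the same scale.
std::int64_t addFixed(std::int64_t a, std::int64_t b)
{
    return a + b;
}

// A product carries the scale twice, so divide once to restore it.
// Integer division truncates toward zero.
std::int64_t mulFixed(std::int64_t a, std::int64_t b)
{
    return (a * b) / kScale;
}
```

Only the conversions at the boundaries touch floating point; the arithmetic in between is integer-only. The intermediate product in `mulFixed` can overflow for large values, so the scale and integer width have to be chosen for the actual data range.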

For appropriately sized multiplies and adds, integers can be much faster and may need much less memory; you can also apply SIMD instructions to them.
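As a rough sketch of what the integer version of the element-wise add might look like, assuming the values fit in 32 bits: the function name `addInt32` is hypothetical, and `std::int32_t` is just one possible choice of "appropriately sized" type (half the size of a `double`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical integer counterpart of addMatrices: writes a[i] + b[i]
// into out[i] for i in [0, count). A plain counted loop over contiguous
// arrays like this is one that optimizing compilers can commonly
// auto-vectorize into SIMD instructions.
void addInt32(const std::int32_t* a, const std::int32_t* b,
              std::int32_t* out, std::size_t count)
{
    assert(a != nullptr && b != nullptr && out != nullptr);
    for (std::size_t i = 0; i < count; ++i)
    {
        out[i] = a[i] + b[i];
    }
}
```

Signed overflow is undefined behavior in C++, so this only works if every sum is known to stay within the range of the chosen type.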

Beyond that, improving cache locality, minimizing branching, and minimizing dynamic allocations could also make the program a few times faster.
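Two of those points can be illustrated on the `addMatrices` function from the question. The sketch below is a hypothetical variant (the name `addMatricesInto` and the caller-supplied output parameter `C` are assumptions, not part of the original code): the `add` test is done once outside the loop instead of per element, and the caller provides the output buffer so repeated calls do not allocate.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical variant of addMatrices. The caller owns the output buffer C
// (m*n doubles), and the add/subtract decision is made once rather than
// on every element.
void addMatricesInto(int m, int n, const double* A, const double* B,
                     double* C, bool add)
{
    assert(m >= 0 && n >= 0);
    const std::size_t count =
        static_cast<std::size_t>(m) * static_cast<std::size_t>(n);

    if (add)
    {
        for (std::size_t i = 0; i < count; ++i)
            C[i] = A[i] + B[i];
    }
    else
    {
        for (std::size_t i = 0; i < count; ++i)
            C[i] = A[i] - B[i];
    }
}
```

This does not change the number of floating-point operations; it only removes per-element overhead around them. An optimizing compiler may already hoist the branch on its own, so it is worth measuring before and after.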

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow