How to enable GCC to vectorize this loop

https://stackoverflow.com/questions/22558916

18-06-2023
Question

I have this loop, where b2 is a float, x1 is an Eigen (C++) vector of floats, and a1 and a0 are ints.

for(int i=1;i<9;i++)
    b2+=a0*(float)0.5*(std::log(fabs(x1(a1+a0*(i-1))))+std::log(fabs(x1(a1+a0*i))));

GCC returns:

analyze_innermost: failed: evolution of base is not affine.

Is there a simple way to rewrite the loop so that GCC can vectorize it? (I'm compiling with all the unsafe options enabled; I'm doing this to learn.)

Edit:

x1 is an Eigen construct. I'm using GCC 4.8.1 with the -O3 flag.

Solution

I would break this up into 3 loops:

float t1[9];
float t2[9];

for (int i = 0; i < 9; ++i)            // (1) - gather input terms
    t1[i] = x1(a1+a0*i);

for (int i = 0; i < 9; ++i)            // (2) - do expensive log/fabs operations
    t2[i] = std::log(std::fabs(t1[i])); //      with minimum redundancy

for (int i = 1; i < 9; ++i)            // (3) - wrap it all up
    b2 += a0*0.5f*(t2[i-1] + t2[i]);

I suspect that (1) may not be vectorizable, because its reads from x1 are strided by a0 rather than contiguous (unless you have AVX2 with gather loads), but (2) and (3) have a reasonable chance.
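To check that the split preserves the result, here is a minimal self-contained sketch without Eigen. The function names and the raw `const float*` standing in for x1 are assumptions made for illustration; the caller is assumed to guarantee that every index a1 + a0*i for i in [0, 8] is in bounds.

```cpp
#include <cmath>

// Hypothetical helper: the three-loop version, with x standing in for x1.
float log_trapezoid_split(const float* x, int a1, int a0, float b2)
{
    float t1[9];
    float t2[9];

    for (int i = 0; i < 9; ++i)              // (1) strided gather into a contiguous buffer
        t1[i] = x[a1 + a0 * i];

    for (int i = 0; i < 9; ++i)              // (2) each log/fabs is computed exactly once
        t2[i] = std::log(std::fabs(t1[i]));

    for (int i = 1; i < 9; ++i)              // (3) sum of adjacent pairs
        b2 += a0 * 0.5f * (t2[i - 1] + t2[i]);

    return b2;
}

// Hypothetical helper: the single loop from the question, for comparison.
float log_trapezoid_original(const float* x, int a1, int a0, float b2)
{
    for (int i = 1; i < 9; ++i)
        b2 += a0 * 0.5f * (std::log(std::fabs(x[a1 + a0 * (i - 1)]))
                         + std::log(std::fabs(x[a1 + a0 * i])));
    return b2;
}
```

Both functions add the terms in the same order, so they should agree up to float rounding. Note that GCC generally vectorizes a float reduction such as loop (3) only when reordering is permitted (for example with -ffast-math), and whether loop (2) vectorizes depends on a vector log implementation being available to the compiler.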

OTHER TIPS

Your example cannot be easily vectorized because you're not accessing the entries of x1 in a sequential manner.

With sequential access, it could be vectorized like this:

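ArrayXf x1;
// i and j are the start indices of the two contiguous runs (j = i + 1 for the loop in the question)
b2 += 0.5f * a0 * (x1.segment(i,8).abs().log() + x1.segment(j,8).abs().log()).sum();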
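For readers without Eigen at hand, the contiguous case can be sketched in plain C++. This is an illustration only: the function name, the `std::vector<float>` input, and the `start` and `scale` parameters are assumptions, and the loop covers the eight adjacent pairs of the question's loop, including its factor of 0.5.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: sums log|x[k]| + log|x[k+1]| over eight adjacent pairs
// beginning at index start, then applies the 0.5 * scale factor once.
// The caller must ensure start + 8 is a valid index into x.
float log_trapezoid_contiguous(const std::vector<float>& x, int start, float scale)
{
    float sum = 0.0f;
    for (int k = 0; k < 8; ++k)              // unit-stride reads from both runs
        sum += std::log(std::fabs(x[start + k]))
             + std::log(std::fabs(x[start + k + 1]));
    return 0.5f * scale * sum;
}
```

Because both reads advance by one element per iteration, the compiler sees affine, unit-stride accesses, which is the property the strided original lacks.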

Licensed under: CC-BY-SA with attribution