GCC can't vectorize this simple loop ('number of iterations cannot be computed') yet managed a similar one in the same code?

StackOverflow https://stackoverflow.com/questions/22731165

  •  23-06-2023

So, I have C++ code with this loop:

for(i=0;i<(m-1);i++)    N4[i]=(i+m-1-Rigta[i]-1-N3[i])/N0;

All the quantities involved are ints. From GCC's vectorization report I get:

babar.cpp:233: note: ===== analyze_loop_nest =====
babar.cpp:233: note: === vect_analyze_loop_form ===
babar.cpp:233: note: === get_loop_niters ===
babar.cpp:233: note: not vectorized: number of iterations cannot be computed.
babar.cpp:233: note: bad loop form.

I am wondering why 'the number of iterations cannot be computed'. FWIW, m is declared as const int& m. What makes this even more puzzling is that just above, in the same code, I have:

for(i=1;i<(m-1);i++)    a2[i]=(x[i]+x[i+m-1])*0.5f;

and the loop above gets vectorized just fine (here a2 and x are floats). I'm compiling with the

-Ofast -ftree-vectorizer-verbose=10 -mtune=native -march=native

flags on GCC 4.8.1 on an i7.

Thanks in advance,

Edit:

Following @nodakai's idea, I tried this:

const int mm = m;
for(i=0;i<(mm-1);i++)    N4[i]=(i+mm-1-Rigta[i]-1-N3[i])/N0;

This didn't get me quite there:

babar.cpp:234: note: not vectorized: relevant stmt not supported: D.55255_812 = D.55254_811 / N0_34;
babar.cpp:234: note: bad operation or unsupported loop bound.

So of course, I tried:

const int mm=m;
const float G0=1.0f/(float)N0;
for(i=0;i<(mm-1);i++)   N4[i]=(i+mm-1-Rigta[i]-1-N3[i])*G0;

which then produced:

babar.cpp:235: note: LOOP VECTORIZED.

(i.e. success). Oddly enough, the mm seems to be necessary(?!).
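
One caveat about the reciprocal rewrite: storing x * (1.0f / N0) into an int truncates a rounded float product, which is not guaranteed to equal the integer division x / N0 for every input (1.0f / N0 is generally inexact, and ints above 2^24 in magnitude are not all representable as float). A minimal sketch for checking this over a value range follows; count_mismatches is a hypothetical helper, not something from the original code.

```cpp
#include <cassert>

// Hypothetical helper (not from the post): counts the numerators x in
// [lo, hi] for which truncating x * (1.0f / n0) differs from x / n0.
int count_mismatches(int n0, int lo, int hi)
{
    assert(n0 != 0);  // division by zero is undefined for the integer form
    const float g0 = 1.0f / static_cast<float>(n0);
    int mismatches = 0;
    for (int x = lo; x <= hi; ++x) {
        const int by_division = x / n0;                      // exact integer division
        const int by_reciprocal = static_cast<int>(x * g0);  // what the rewritten loop stores
        if (by_division != by_reciprocal) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

If this returns a non-zero count for the actual N0 and the range of numerators the loop can produce, the float version is not a drop-in replacement for the integer one. Results may also shift under -Ofast, which relaxes floating-point rules.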


Solution

Can you try these two steps and see if it makes any difference?

  1. Insert const int mm = m; just before the loop.
  2. Replace all the occurrences of m inside the loop with mm.
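
A minimal sketch of those two steps, wrapped in a hypothetical function fill_n4 (the function and its signature are assumptions for illustration; only the loop body mirrors the question). One plausible reason the copy helps, stated here as an assumption rather than something the vectorizer report confirms: m is a const int& and N4 holds ints, so the compiler may be unable to rule out that a store to N4[i] changes the int that m refers to, which would force m - 1 to be re-read on every iteration. That would also be consistent with the float loop vectorizing, since stores to a float array cannot legally alias an int.

```cpp
#include <cassert>

// Hypothetical stand-in for the question's loop. The names mirror the
// post, but this function is an illustration, not the asker's code.
// m arrives by const reference, as in the question.
void fill_n4(const int& m, int n0,
             const int* rigta, const int* n3, int* n4)
{
    assert(n0 != 0);

    // Step 1: copy the referenced value into a local before the loop.
    // The address of mm is never taken, so stores through n4 cannot
    // change it, and the trip count (mm - 1) is fixed on loop entry.
    const int mm = m;

    // Step 2: use mm, not m, everywhere inside the loop.
    for (int i = 0; i < mm - 1; ++i) {
        n4[i] = (i + mm - 1 - rigta[i] - 1 - n3[i]) / n0;
    }
}
```

Note that this copy only addresses the loop bound; per the report in the question, the integer division still blocked vectorization on GCC 4.8.1 afterwards.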

Other tips

Your loop's trip count is probably not a multiple of the vectorization factor. Note that the loop that vectorizes runs one fewer iteration than the one that does not. As a simple test of whether this is the case, you can change the starting index of the non-vectorized loop to 1 and handle the i = 0 case before the loop, like this:

N4[0] = (m - 1 - Rigta[0] - 1 - N3[0]) / N0;
for(i=1; i<(m-1); i++) {
    N4[i]=(i + m - 1 - Rigta[i] - 1 - N3[i])/N0;
}
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow