How to vectorize inner loops with omp simd

Question

This does not fit in the comments, so I am posting it here:

This is what I get when I compile it on an Ivy Bridge CPU. The loop on line 15 is not vectorized for the CPU (warning #13379), but note that it is vectorized for the Intel MIC architecture. The loop on line 16 is vectorized on the CPU, even with the target directives removed.

The reason for the vectorization problem is given in the first remark, "subscript too complex":

ifort -openmp simd.f90 -warn -O3 -c -vec-report=3 -xHOST -fpp 
ifort: command line remark #10382: option '-xHOST' setting '-xCORE-AVX-I'
simd.f90(17): (col. 33) remark: loop was not vectorized: subscript too complex
simd.f90(15): (col. 5) warning #13379: loop was not vectorized with "simd"
simd.f90(16): (col. 8) remark: LOOP WAS VECTORIZED
simd.f90(13): (col. 3) remark: loop was not vectorized: not inner loop
simd.f90(13): (col. 3) remark: loop was not vectorized: not inner loop
simd.f90(31): (col. 4) remark: LOOP WAS VECTORIZED
simd.f90(30): (col. 3) remark: loop was not vectorized: not inner loop
simd.f90(29): (col. 7) remark: loop was not vectorized: not inner loop
simd.f90(29): (col. 7) remark: BLOCK WAS VECTORIZED
ifort: warning #10362: Environment configuration problem encountered.  Please check for proper MPSS installation and environment setup.
simd.f90(15): (col. 5) remark: *MIC* OpenMP SIMD LOOP WAS VECTORIZED
simd.f90(13): (col. 3) remark: *MIC* loop was not vectorized: not inner loop
simd.f90(13): (col. 3) remark: *MIC* loop was not vectorized: not inner loop
simd.f90(31): (col. 4) remark: *MIC* LOOP WAS VECTORIZED
simd.f90(31): (col. 4) remark: *MIC* PEEL LOOP WAS VECTORIZED
simd.f90(31): (col. 4) remark: *MIC* REMAINDER LOOP WAS VECTORIZED
simd.f90(30): (col. 3) remark: *MIC* loop was not vectorized: not inner loop
simd.f90(29): (col. 7) remark: *MIC* loop was not vectorized: not inner loop
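
To illustrate what kind of subscript can get in the way of vectorization, here is a minimal sketch. It is not the code from simd.f90; the program name, the arrays `a`, `b`, `c` and the index array `idx` are all hypothetical. The first loop reads its input through an index array, so the compiler cannot see the access pattern at compile time; the second loop uses a plain unit-stride subscript. Whether a particular compiler version actually reports "subscript too complex" for the first loop is an assumption and should be checked in the vectorization report.

```fortran
program simd_subscript_sketch
  implicit none
  integer, parameter :: n = 16
  integer :: idx(n), i
  real :: a(n), b(n), c(n)

  ! Hypothetical test data: idx is the identity mapping, so both
  ! variants below must produce the same values.
  do i = 1, n
    idx(i) = i
    a(i) = real(i)
  end do

  ! Variant 1: indirect subscript. The element of a that is read
  ! depends on the run-time contents of idx, which is harder for the
  ! vectorizer to analyze (assumption: the report may flag this loop).
  !$omp simd
  do i = 1, n
    b(i) = 2.0 * a(idx(i))
  end do

  ! Variant 2: direct unit-stride subscript, the easy case for
  ! vectorization.
  !$omp simd
  do i = 1, n
    c(i) = 2.0 * a(i)
  end do

  ! Check that both variants computed the same result.
  if (all(b == c)) then
    print *, 'results match'
  else
    print *, 'results differ'
  end if
end program simd_subscript_sketch
```

Compiling this with the same options as above (for example `-openmp -O3 -vec-report=3`) should show how your compiler treats each of the two loops. If the inner loop in your code can be rewritten so that its subscript looks like the second variant, the `simd` directive has a better chance of being honored on the CPU as well.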