Loop unrolling won't give you any benefit for this code, because the overhead of the function call to printf() itself dominates the work done at each iteration. The compiler may be aware of this. Since it is being asked to optimize the code, it may decide that unrolling would increase the code size for no appreciable run-time gain, and that the added risk of instruction cache misses makes the unrolling not worth performing.
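To make the point about call overhead concrete, here is a rough sketch of what conventional unrolling amounts to. It assumes the original loop prints each integer from 0 to n-1 with its own printf("%d", i) call. The function name print_loop_unrolled_by_4 and its call-counting return value are hypothetical, added only for illustration.

```c
#include <stdio.h>

/* Hypothetical 4-way unrolling of an assumed loop that prints 0..n-1.
 * Returns the number of printf() calls made, so the count can be inspected. */
int print_loop_unrolled_by_4(int n) {
    int calls = 0;
    int i = 0;

    /* Unrolled body: four values per iteration, but still four calls. */
    for (; i + 3 < n; i += 4) {
        printf("%d", i);
        printf("%d", i + 1);
        printf("%d", i + 2);
        printf("%d", i + 3);
        calls += 4;
    }

    /* Remainder loop for the last n % 4 values. */
    for (; i < n; i++) {
        printf("%d", i);
        calls++;
    }

    return calls;
}
```

The loop-control overhead drops, but the function still makes exactly n calls to printf(), which is where the time goes.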
To speed up this loop, the unrolling would have to reduce the number of calls to printf() itself. I am unaware of any optimizing compiler that is capable of doing that.
As an example of unrolling the loop to reduce the number of printf() calls, consider this code:
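#include <stdio.h>

/* Prints 0..n-1 using about n/8 printf() calls; assumes n >= 0. */
void print_loop_unrolled (int n) {
    int i = -8;
    if (n % 8) {
        /* The leftover n % 8 values are single digits: print them as a string prefix. */
        printf("%.*s", n % 8, "01234567");
        i += n % 8;
    }
    /* Each remaining call prints eight consecutive values. */
    while ((i += 8) < n) {
        printf("%d%d%d%d%d%d%d%d", i, i+1, i+2, i+3, i+4, i+5, i+6, i+7);
    }
}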
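One way to check that the version above matches a plain one-call-per-value loop is sketched below. This is only an illustration: the sink type, sink_printf(), simple_loop() and batched_loop() are hypothetical names, sink_printf() is a stand-in for printf() that appends to a fixed buffer and counts calls, and the simple loop is my assumption about what the original code looks like.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for stdout: a fixed buffer plus a call counter. */
typedef struct {
    char   data[4096];
    size_t len;
    int    calls;
} sink;

/* Appends formatted text to the sink (truncating if full) and counts the call. */
static void sink_printf(sink *s, const char *fmt, ...) {
    assert(s->len < sizeof s->data);
    size_t room = sizeof s->data - s->len - 1;  /* space left, excluding the NUL */

    va_list ap;
    va_start(ap, fmt);
    int w = vsnprintf(s->data + s->len, room + 1, fmt, ap);
    va_end(ap);

    s->calls++;
    if (w > 0) {
        s->len += ((size_t)w < room) ? (size_t)w : room;
    }
}

/* Assumed form of the original loop: one call per value. */
static void simple_loop(sink *s, int n) {
    for (int i = 0; i < n; i++) {
        sink_printf(s, "%d", i);
    }
}

/* Same logic as print_loop_unrolled(), writing to the sink; assumes n >= 0. */
static void batched_loop(sink *s, int n) {
    int i = -8;
    if (n % 8) {
        sink_printf(s, "%.*s", n % 8, "01234567");
        i += n % 8;
    }
    while ((i += 8) < n) {
        sink_printf(s, "%d%d%d%d%d%d%d%d", i, i+1, i+2, i+3, i+4, i+5, i+6, i+7);
    }
}
```

With two zero-initialized sinks (sink a = {0}, b = {0};), running simple_loop(&a, 100) and batched_loop(&b, 100) should leave identical text in a.data and b.data, with a.calls at 100 and b.calls at 13: one call for the 4 leftover values plus 12 calls of eight values each.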