Question

Our lecturer has explained that this piece of function which calculates the length of a character string...

int strlen_1(const char *str) {
    const char *temp = str;
    while(*temp != '\0') {
        temp++;
    }
    return temp - str;
}

... will calculate it faster than this one...

int strlen_03(const char *str) {
    int i;
    for (i = 0; *(str+i) != '\0'; i++);
    return i;

I think he said that it was related to arithmetic calculations, something like in the first one any arithmetic calculus is done, but I cannot understand that, I see them both in the same level. Could you please explain me the reason in other words?

PS. I do understand pointers and I can understand what is going on, it is like stepping through the elements of the array stored into the "RAM cells" by one unit.

Thanks in advance.

Était-ce utile?

La solution

Ignoring the optimization for a moment and just looking at the paper algorithms:

The former performs this calculation repeatedly:

addr++

with a result calculated by a difference calculation

addr1 - addr0

the latter performs these calculations repeated

addr0 + i
i++

with the result being calculated by a value-return

i

In other words, twice as much work is being done in the loop for the benefit of doing half as much in calculating the final result.

getting to the optimized ASM, the first generates this at -O3 on my clang:

0x100000ee4:  cmpb   $0, 1(%rbx)
0x100000ee8:  leaq   1(%rbx), %rbx
0x100000eec:  sete   %al
0x100000eef:  testb  $1, %al
0x100000ef1:  je     0x100000ee4

the second generates this:

0x100000f09:  incl   %ebx
0x100000f0b:  cmpb   $0, (%rax)
0x100000f0e:  leaq   1(%rax), %rax
0x100000f12:  sete   %cl
0x100000f15:  testb  $1, %cl
0x100000f18:  je     0x100000f09

I left out the constant one-timers for the return values because they are not core to the complexity of the loop. The optimizer is pretty good, noting the only major difference is that single:

0x100000f09:  incl   %ebx

which is your i

Autres conseils

This is a micro-optimization, and a modern compiler would likely end up generating the same assembly for both, but for a non-optimized version, here is why:

int strlen_1(const char *str) 
{
    const char *temp = str; // declare the iterator
    while(*temp != '\0')   // dereference the pointer
                           // test the iterator 
    {
        temp++; // increment the iterator
    }
    return temp - str; // pointer subtraction
}

For a string of length N, this gives you 3N + 2 operations.

int strlen_03(const char *str) 
{
    int i; // declare your iterator
    for (i = 0; *(str+i) != '\0'; i++); // initialize the iterator
                                        // add i to str
                                        // dereference that pointer value
                                        // test it against \0
                                        // increment i
    return i;
}

For the same string, this gives you 4N + 2 operations.

Again, a modern compiler will likely fix this for you, and this small loop isn't likely to make much of a difference even in the un-optimized form for most strings (only for very long strings).

Licencié sous: CC-BY-SA avec attribution
Non affilié à StackOverflow
scroll top