Is there a performance difference between i++ and ++i in C++?

https://stackoverflow.com/questions/24901

09-06-2019
|

Question

We have the question is there a performance difference between i++ and ++i in C?

What's the answer for C++?

Solution

[Executive Summary: Use ++i if you don't have a specific reason to use i++.]

For C++, the answer is a bit more complicated.

If i is a simple type (not an instance of a C++ class), then the answer given for C ("No there is no performance difference") holds, since the compiler is generating the code.

However, if i is an instance of a C++ class, then i++ and ++i are making calls to one of the operator++ functions. Here's a standard pair of these functions:

Foo& Foo::operator++()   // called for ++i
{
    this->data += 1;
    return *this;
}

Foo Foo::operator++(int ignored_dummy_value)   // called for i++
{
    Foo tmp(*this);   // variable "tmp" cannot be optimized away by the compiler
    ++(*this);
    return tmp;
}

Since the compiler isn't generating code, but just calling an operator++ function, there is no way to optimize away the tmp variable and its associated copy constructor. If the copy constructor is expensive, then this can have a significant performance impact.

OTHER TIPS

Yes. There is.

The ++ operator may or may not be defined as a function. For primitive types (int, double, ...) the operators are built in, so the compiler will probably be able to optimize your code. But in the case of an object that defines the ++ operator things are different.

The operator++(int) function must create a copy. That is because postfix ++ is expected to return a different value than what it holds: it must hold its value in a temp variable, increment its value and return the temp. In the case of operator++(), prefix ++, there is no need to create a copy: the object can increment itself and then simply return itself.

Here is an illustration of the point:

struct C
{
    C& operator++();      // prefix
    C  operator++(int);   // postfix

private:

    int i_;
};

C& C::operator++()
{
    ++i_;
    return *this;   // self, no copy created
}

C C::operator++(int ignored_dummy_value)
{
    C t(*this);
    ++(*this);
    return t;   // return a copy
}

Every time you call operator++(int) you must create a copy, and the compiler can't do anything about it. When given the choice, use operator++(); this way you don't save a copy. It might be significant in the case of many increments (large loop?) and/or large objects.

Here's a benchmark for the case when increment operators are in different translation units. Compiler with g++ 4.5.

Ignore the style issues for now

// a.cc
#include <ctime>
#include <array>
class Something {
public:
    Something& operator++();
    Something operator++(int);
private:
    std::array<int,PACKET_SIZE> data;
};

int main () {
    Something s;

    for (int i=0; i<1024*1024*30; ++i) ++s; // warm up
    std::clock_t a = clock();
    for (int i=0; i<1024*1024*30; ++i) ++s;
    a = clock() - a;

    for (int i=0; i<1024*1024*30; ++i) s++; // warm up
    std::clock_t b = clock();
    for (int i=0; i<1024*1024*30; ++i) s++;
    b = clock() - b;

    std::cout << "a=" << (a/double(CLOCKS_PER_SEC))
              << ", b=" << (b/double(CLOCKS_PER_SEC)) << '\n';
    return 0;
}

O(n) increment

Test

// b.cc
#include <array>
class Something {
public:
    Something& operator++();
    Something operator++(int);
private:
    std::array<int,PACKET_SIZE> data;
};


Something& Something::operator++()
{
    for (auto it=data.begin(), end=data.end(); it!=end; ++it)
        ++*it;
    return *this;
}

Something Something::operator++(int)
{
    Something ret = *this;
    ++*this;
    return ret;
}

Results

Results (timings are in seconds) with g++ 4.5 on a virtual machine:

Flags (--std=c++0x)       ++i   i++
-DPACKET_SIZE=50 -O1      1.70  2.39
-DPACKET_SIZE=50 -O3      0.59  1.00
-DPACKET_SIZE=500 -O1    10.51 13.28
-DPACKET_SIZE=500 -O3     4.28  6.82

O(1) increment

Test

Let us now take the following file:

// c.cc
#include <array>
class Something {
public:
    Something& operator++();
    Something operator++(int);
private:
    std::array<int,PACKET_SIZE> data;
};


Something& Something::operator++()
{
    return *this;
}

Something Something::operator++(int)
{
    Something ret = *this;
    ++*this;
    return ret;
}

It does nothing in the incrementation. This simulates the case when incrementation has constant complexity.

Results

Results now vary extremely:

Flags (--std=c++0x)       ++i   i++
-DPACKET_SIZE=50 -O1      0.05   0.74
-DPACKET_SIZE=50 -O3      0.08   0.97
-DPACKET_SIZE=500 -O1     0.05   2.79
-DPACKET_SIZE=500 -O3     0.08   2.18
-DPACKET_SIZE=5000 -O3    0.07  21.90

Conclusion

Performance-wise

If you do not need the previous value, make it a habit to use pre-increment. Be consistent even with builtin types, you'll get used to it and do not run risk of suffering unecessary performance loss if you ever replace a builtin type with a custom type.

Semantic-wise

i++ says increment i, I am interested in the previous value, though.
++i says increment i, I am interested in the current value or increment i, no interest in the previous value. Again, you'll get used to it, even if you are not right now.

Knuth.

Premature optimization is the root of all evil. As is premature pessimization.

It's not entirely correct to say that the compiler can't optimize away the temporary variable copy in the postfix case. A quick test with VC shows that it, at least, can do that in certain cases.

In the following example, the code generated is identical for prefix and postfix, for instance:

#include <stdio.h>

class Foo
{
public:

    Foo() { myData=0; }
    Foo(const Foo &rhs) { myData=rhs.myData; }

    const Foo& operator++()
    {
        this->myData++;
        return *this;
    }

    const Foo operator++(int)
    {
        Foo tmp(*this);
        this->myData++;
        return tmp;
    }

    int GetData() { return myData; }

private:

    int myData;
};

int main(int argc, char* argv[])
{
    Foo testFoo;

    int count;
    printf("Enter loop count: ");
    scanf("%d", &count);

    for(int i=0; i<count; i++)
    {
        testFoo++;
    }

    printf("Value: %d\n", testFoo.GetData());
}

Whether you do ++testFoo or testFoo++, you'll still get the same resulting code. In fact, without reading the count in from the user, the optimizer got the whole thing down to a constant. So this:

for(int i=0; i<10; i++)
{
    testFoo++;
}

printf("Value: %d\n", testFoo.GetData());

Resulted in the following:

00401000  push        0Ah  
00401002  push        offset string "Value: %d\n" (402104h) 
00401007  call        dword ptr [__imp__printf (4020A0h)]

So while it's certainly the case that the postfix version could be slower, it may well be that the optimizer will be good enough to get rid of the temporary copy if you're not using it.

The Google C++ Style Guide says:

Preincrement and Predecrement

Use prefix form (++i) of the increment and decrement operators with iterators and other template objects.

Definition: When a variable is incremented (++i or i++) or decremented (--i or i--) and the value of the expression is not used, one must decide whether to preincrement (decrement) or postincrement (decrement).

Pros: When the return value is ignored, the "pre" form (++i) is never less efficient than the "post" form (i++), and is often more efficient. This is because post-increment (or decrement) requires a copy of i to be made, which is the value of the expression. If i is an iterator or other non-scalar type, copying i could be expensive. Since the two types of increment behave the same when the value is ignored, why not just always pre-increment?

Cons: The tradition developed, in C, of using post-increment when the expression value is not used, especially in for loops. Some find post-increment easier to read, since the "subject" (i) precedes the "verb" (++), just like in English.

Decision: For simple scalar (non-object) values there is no reason to prefer one form and we allow either. For iterators and other template types, use pre-increment.

I would like to point out an excellent post by Andrew Koenig on Code Talk very recently.

http://dobbscodetalk.com/index.php?option=com_myblog&show=Efficiency-versus-intent.html&Itemid=29

At our company also we use convention of ++iter for consistency and performance where applicable. But Andrew raises over-looked detail regarding intent vs performance. There are times when we want to use iter++ instead of ++iter.

So, first decide your intent and if pre or post does not matter then go with pre as it will have some performance benefit by avoiding creation of extra object and throwing it.

@Ketan

...raises over-looked detail regarding intent vs performance. There are times when we want to use iter++ instead of ++iter.

Obviously post and pre-increment have different semantics and I'm sure everyone agrees that when the result is used you should use the appropriate operator. I think the question is what should one do when the result is discarded (as in for loops). The answer to this question (IMHO) is that, since the performance considerations are negligible at best, you should do what is more natural. For myself ++i is more natural but my experience tells me that I'm in a minority and using i++ will cause less metal overhead for most people reading your code.

After all that's the reason the language is not called "++C".[*]

[*] Insert obligatory discussion about ++C being a more logical name.

Mark: Just wanted to point out that operator++'s are good candidates to be inlined, and if the compiler elects to do so, the redundant copy will be eliminated in most cases. (e.g. POD types, which iterators usually are.)

That said, it's still better style to use ++iter in most cases. :-)

The performance difference between ++i and i++ will be more apparent when you think of operators as value-returning functions and how they are implemented. To make it easier to understand what's happening, the following code examples will use int as if it were a struct.

++i increments the variable, then returns the result. This can be done in-place and with minimal CPU time, requiring only one line of code in many cases:

int& int::operator++() { 
     return *this += 1;
}

But the same cannot be said of i++.

Post-incrementing, i++, is often seen as returning the original value before incrementing. However, a function can only return a result when it is finished. As a result, it becomes necessary to create a copy of the variable containing the original value, increment the variable, then return the copy holding the original value:

int int::operator++(int& _Val) {
    int _Original = _Val;
    _Val += 1;
    return _Original;
}

When there is no functional difference between pre-increment and post-increment, the compiler can perform optimization such that there is no performance difference between the two. However, if a composite data type such as a struct or class is involved, the copy constructor will be called on post-increment, and it will not be possible to perform this optimization if a deep copy is needed. As such, pre-increment generally is faster and requires less memory than post-increment.

++i - faster not using the return value
i++ - faster using the return value

When not using the return value the compiler is guaranteed not to use a temporary in the case of ++i. Not guaranteed to be faster, but guaranteed not to be slower.

When using the return value i++ allows the processor to push both the increment and the left side into the pipeline since they don't depend on each other. ++i may stall the pipeline because the processor cannot start the left side until the pre-increment operation has meandered all the way through. Again, a pipeline stall is not guaranteed, since the processor may find other useful things to stick in.

An the reason why you ought to use ++i even on built-in types where there's no performance advantage is to create a good habit for yourself.

@Mark: I deleted my previous answer because it was a bit flip, and deserved a downvote for that alone. I actually think it's a good question in the sense that it asks what's on the minds of a lot of people.

The usual answer is that ++i is faster than i++, and no doubt it is, but the bigger question is "when should you care?"

If the fraction of CPU time spent in incrementing iterators is less than 10%, then you may not care.

If the fraction of CPU time spent in incrementing iterators is greater than 10%, you can look at which statements are doing that iterating. See if you could just increment integers rather than using iterators. Chances are you could, and while it may be in some sense less desirable, chances are pretty good you will save essentially all the time spent in those iterators.

I've seen an example where the iterator-incrementing was consuming well over 90% of the time. In that case, going to integer-incrementing reduced execution time by essentially that amount. (i.e. better than 10x speedup)

The intended question was about when the result is unused (that's clear from the question for C). Can somebody fix this since the question is "community wiki"?

About premature optimizations, Knuth is often quoted. That's right. but Donald Knuth would never defend with that the horrible code which you can see in these days. Ever seen a = b + c among Java Integers (not int)? That amounts to 3 boxing/unboxing conversions. Avoiding stuff like that is important. And uselessly writing i++ instead of ++i is the same mistake. EDIT: As phresnel nicely puts it in a comment, this can be summed up as "premature optimization is evil, as is premature pessimization".

Even the fact that people are more used to i++ is an unfortunate C legacy, caused by a conceptual mistake by K&R (if you follow the intent argument, that's a logical conclusion; and defending K&R because they're K&R is meaningless, they're great, but they aren't great as language designers; countless mistakes in the C design exist, ranging from gets() to strcpy(), to the strncpy() API (it should have had the strlcpy() API since day 1)).

Btw, I'm one of those not used enough to C++ to find ++i annoying to read. Still, I use that since I acknowledge that it's right.

@wilhelmtell

The compiler can elide the temporary. Verbatim from the other thread:

The C++ compiler is allowed to eliminate stack based temporaries even if doing so changes program behavior. MSDN link for VC 8:

http://msdn.microsoft.com/en-us/library/ms364057(VS.80).aspx

Time to provide folks with gems of wisdom ;) - there is simple trick to make C++ postfix increment behave pretty much the same as prefix increment (Invented this for myself, but the saw it as well in other people code, so I'm not alone).

Basically, trick is to use helper class to postpone increment after the return, and RAII comes to rescue

#include <iostream>

class Data {
    private: class DataIncrementer {
        private: Data& _dref;

        public: DataIncrementer(Data& d) : _dref(d) {}

        public: ~DataIncrementer() {
            ++_dref;
        }
    };

    private: int _data;

    public: Data() : _data{0} {}

    public: Data(int d) : _data{d} {}

    public: Data(const Data& d) : _data{ d._data } {}

    public: Data& operator=(const Data& d) {
        _data = d._data;
        return *this;
    }

    public: ~Data() {}

    public: Data& operator++() { // prefix
        ++_data;
        return *this;
    }

    public: Data operator++(int) { // postfix
        DataIncrementer t(*this);
        return *this;
    }

    public: operator int() {
        return _data;
    }
};

int
main() {
    Data d(1);

    std::cout <<   d << '\n';
    std::cout << ++d << '\n';
    std::cout <<   d++ << '\n';
    std::cout << d << '\n';

    return 0;
}

Invented is for some heavy custom iterators code, and it cuts down run-time. Cost of prefix vs postfix is one reference now, and if this is custom operator doing heavy moving around, prefix and postfix yielded the same run-time for me.

Both are as fast ;) If you want it is the same calculation for the processor, it's just the order in which it is done that differ.

For example, the following code :

#include <stdio.h>

int main()
{
    int a = 0;
    a++;
    int b = 0;
    ++b;
    return 0;
}

Produce the following assembly :

 0x0000000100000f24 <main+0>: push   %rbp
 0x0000000100000f25 <main+1>: mov    %rsp,%rbp
 0x0000000100000f28 <main+4>: movl   $0x0,-0x4(%rbp)
 0x0000000100000f2f <main+11>:    incl   -0x4(%rbp)
 0x0000000100000f32 <main+14>:    movl   $0x0,-0x8(%rbp)
 0x0000000100000f39 <main+21>:    incl   -0x8(%rbp)
 0x0000000100000f3c <main+24>:    mov    $0x0,%eax
 0x0000000100000f41 <main+29>:    leaveq 
 0x0000000100000f42 <main+30>:    retq

You see that for a++ and b++ it's an incl mnemonic, so it's the same operation ;)

When you write i++ you are telling the compiler to increment after it finishes this line or loop.

++i is a little different than i++. In i++ you increment after you finish the loop but ++i you increment directly before the loop finishes.

++i is faster than i++ because it doesn't return an old copy of the value.

It's also more intuitive:

x = i++;  // x contains the old value of i
y = ++i;  // y contains the new value of i

This C example prints "02" instead of the "12" you might expect:

#include <stdio.h>

int main(){
    int a = 0;
    printf("%d", a++);
    printf("%d", ++a);
    return 0;
}

Same for C++:

#include <iostream>
using namespace std;

int main(){
    int a = 0;
    cout << a++;
    cout << ++a;
    return 0;
}

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow