Why is ++i considered an l-value, but i++ is not?

https://stackoverflow.com/questions/371503

21-08-2019

Question

Why is ++i an l-value, but i++ not?

Solution

As another answer already pointed out, ++i is an lvalue so that it can be bound to a non-const reference:

int v = 0;
int const & rcv = ++v; // would work if ++v is an rvalue too
int & rv = ++v; // would not work if ++v is an rvalue

The reason rvalues are allowed to bind to a reference-to-const is so that such a reference can be initialized from a literal:

void taking_refc(int const& v);
taking_refc(10); // valid, 10 is an rvalue though!

Why introduce rvalues at all, you may ask. These terms come up when writing the language rules for two situations:

We want a locator value: something that represents a location containing a value that can be read.
We want to represent the value of an expression.

These two points are taken from the C99 Standard, which includes this helpful footnote:

[ The name “lvalue” comes originally from the assignment expression E1 = E2, in which the left operand E1 is required to be a (modifiable) lvalue. It is perhaps better considered as representing an object “locator value”. What is sometimes called “rvalue” is in this International Standard described as the “value of an expression”. ]

The locator value is called an lvalue, while the value obtained by evaluating that location is called an rvalue. The C++ Standard agrees, in its description of the lvalue-to-rvalue conversion:

4.1/2: The value contained in the object indicated by the lvalue is the rvalue result.

Conclusion

With these semantics it becomes clear why i++ is not an lvalue but an rvalue. The value the expression yields is no longer stored in i (i has been incremented!), so only the value itself is of interest. Modifying the value returned by i++ would make no sense, because there is no location from which we could read that value back. So the Standard says it is an rvalue, and as such it can only bind to a reference-to-const.

In contrast, the expression ++i yields the location (lvalue) of i. Forcing an lvalue-to-rvalue conversion, as in int a = ++i;, reads the value out of it. Alternatively, we can bind a reference to it and read the value later: int &a = ++i;.
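To make the two binding cases concrete, here is a minimal sketch. The function names are made up for illustration; only the reference bindings themselves come from the discussion above.

```cpp
#include <cassert>

// Hypothetical helper: binds a non-const reference to the result of ++i.
bool prefix_result_aliases_variable() {
    int i = 0;
    int& r = ++i;          // OK: ++i is an lvalue designating i
    assert(&r == &i);      // r and i are the same object
    r = 10;                // writing through r writes to i
    return i == 10;
}

// Hypothetical helper: binds a reference-to-const to the result of i++.
int postfix_result_is_detached_copy() {
    int i = 0;
    const int& old = i++;  // OK: an rvalue may bind to a reference-to-const
    // int& bad = i++;     // would not compile: rvalue cannot bind to int&
    i = 42;                // later changes to i do not affect the temporary
    return old;            // still the pre-increment value
}
```

Here the reference bound to ++i aliases i, whereas the reference bound to i++ refers to a separate temporary whose lifetime is extended by the binding.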

Note the other occasions where rvalues are generated. For example, all temporaries are rvalues, as are the results of binary and unary + and -, and all function-call expressions whose return type is not a reference. None of these expressions refers to a named object; they carry only values. Those values can of course be backed by objects that are not const.

C++11 (the "next C++ version" when this was written) added so-called rvalue references which, even though they refer to non-const, can bind to an rvalue. The rationale is to be able to "steal" resources from those anonymous objects and so avoid copies. Assuming a class type with an overloaded prefix ++ (returning Object&) and postfix ++ (returning Object), the first line below makes a copy, while the second steals the resources from the rvalue:

Object o1(++a); // lvalue => can't steal. It will deep copy.
Object o2(a++); // rvalue => steal resources (like just swapping pointers)
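As a rough sketch of how overload resolution tells the two cases apart, assume a toy Object class and a pair of hypothetical sink overloads (none of these names come from a real library). The overload chosen shows whether the argument could be stolen from:

```cpp
#include <cassert>

// Toy class, assumed for illustration only.
struct Object {
    int value = 0;
    Object& operator++() { ++value; return *this; }                     // prefix: returns lvalue
    Object operator++(int) { Object old(*this); ++value; return old; }  // postfix: returns rvalue
};

enum class Binds { ConstLvalueRef, RvalueRef };

// Hypothetical overloads: the first could only copy, the second is free to steal.
Binds sink(const Object&) { return Binds::ConstLvalueRef; }
Binds sink(Object&&)      { return Binds::RvalueRef; }

Binds bind_prefix()  { Object a; return sink(++a); }  // ++a is an lvalue
Binds bind_postfix() { Object a; return sink(a++); }  // a++ is an rvalue
```

A move constructor is selected by the same mechanism as the Object&& overload here, which is why Object o2(a++) can steal while Object o1(++a) has to copy.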

OTHER TIPS

Other people have tackled the functional difference between post and pre increment.

As far as being an lvalue is concerned, i++ can't be assigned to because it doesn't refer to a variable. It refers to a calculated value.

In terms of assignment, both of the following make no sense in the same sort of way:

i++   = 5;
i + 0 = 5;

Because pre-increment returns a reference to the incremented variable rather than a temporary copy, ++i is an lvalue.

Preferring pre-increment for performance reasons becomes an especially good idea when you are incrementing something like an iterator object (e.g. in the STL), which may well be a good deal more heavyweight than an int.

It seems that a lot of people are explaining how ++i is an lvalue, but not why: why did the C++ standards committee add this feature, especially given that C allows neither as an lvalue? From this discussion on comp.std.c++, it appears that it is so you can take its address or bind it to a reference. A code sample excerpted from Christian Bau's post:

   int i;
   extern void f (int* p);
   extern void g (int& p);

   f (&++i);   /* Would be illegal C, but C programmers
                  havent missed this feature */
   g (++i);    /* C++ programmers would like this to be legal */
   g (i++);    /* Not legal C++, and it would be difficult to
                  give this meaningful semantics */
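For a version that can actually be compiled as C++, here is a small sketch. The bodies of f and g are invented for illustration; the excerpt only declares them.

```cpp
#include <cassert>

// Assumed bodies, not part of the original post.
void f(int* p) { *p += 10; }   // modifies the pointee
void g(int& r) { r += 100; }   // modifies the referent

int call_both() {
    int i = 0;
    f(&++i);   // i becomes 1, then f adds 10 through the pointer: 11
    g(++i);    // i becomes 12, then g adds 100 through the reference: 112
    // g(i++); // would not compile: i++ is an rvalue and cannot bind to int&
    return i;
}
```

Both calls operate on i itself, which is only possible because ++i designates the variable rather than a copy.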

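By the way, if i is of built-in type, then assignment statements such as ++i = 10 invoke undefined behavior under the C++98/03 rules, because i is modified twice between sequence points. (C++11 replaced sequence points with "sequenced before" rules, under which this particular expression is generally understood to be well-defined.)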

I'm getting the lvalue error when I try to compile

i++ = 2;

but not when I change it to

++i = 2;

This is because the prefix operator (++i) changes the value in i, then returns i, so it can still be assigned to. The postfix operator (i++) changes the value in i, but returns a temporary copy of the old value, which cannot be modified by the assignment operator.
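You can also ask the compiler directly which value category each expression has. The following is a small sketch using decltype (C++11 or later); the two constant names are made up for illustration.

```cpp
#include <type_traits>
#include <utility>

// decltype(expr) yields T& when expr is an lvalue and plain T when it is a prvalue.
// The operand of decltype is unevaluated, so nothing is actually incremented here.
constexpr bool prefix_is_lvalue =
    std::is_same<decltype(++std::declval<int&>()), int&>::value;
constexpr bool postfix_is_lvalue =
    std::is_lvalue_reference<decltype(std::declval<int&>()++)>::value;

static_assert(prefix_is_lvalue, "++i is an lvalue of type int");
static_assert(!postfix_is_lvalue, "i++ is a prvalue of type int");
```

If either assumption were wrong, the translation unit would fail to compile.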

Answer to original question:

If you're using the increment operators in a statement by themselves, as in a for loop, it really makes no difference. Pre-increment appears to be more efficient, because post-increment has to increment the variable and also return a temporary holding the old value, but for built-in types a compiler will optimize this difference away.

for(int i=0; i<limit; i++)
...

is the same as

for(int i=0; i<limit; ++i)
...

Things get a little more complicated when you're using the return value of the operation as part of a larger statement.

Even the two simple statements

int i = 0;
int a = i++;

and

int i = 0;
int a = ++i;

are different. Which increment operator you use as part of a multi-operator statement depends on the intended behavior. In short: no, you can't just choose one. You have to understand both.
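To spell out the difference between those two snippets, here is a short sketch that returns both a and i after each statement. The function names are assumptions for illustration.

```cpp
#include <cassert>
#include <utility>

// Returns {a, i} after `int a = i++;` starting from i == 0.
std::pair<int, int> with_postfix() {
    int i = 0;
    int a = i++;   // a receives the old value, then i is incremented
    return {a, i};
}

// Returns {a, i} after `int a = ++i;` starting from i == 0.
std::pair<int, int> with_prefix() {
    int i = 0;
    int a = ++i;   // i is incremented first, a receives the new value
    return {a, i};
}
```

In both cases i ends up as 1; only the value stored in a differs (0 for postfix, 1 for prefix).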

POD Pre increment:

The pre-increment should act as if the object was incremented before the expression, and be usable in the expression as if that had happened. Thus the C++ standards committee decided it can also be used as an l-value.

POD Post increment:

The post-increment should increment the POD object and return a copy for use in the expression (see n2521, Section 5.2.6). As a copy is not actually a variable, making it an l-value does not make any sense.

Objects:

Pre- and post-increment on objects are just syntactic sugar by which the language provides a means to call methods on the object. Thus, technically, objects are not restricted by the standard behavior of the language, only by the restrictions imposed by method calls.

It is up to the implementor of these methods to make the behavior of these objects mirror the behavior of the POD objects (this is not required, but expected).

Objects Pre-increment:

The requirement (expected behavior) here is that the object is incremented (what that means depends on the object) and that the method returns a value that is modifiable and looks like the original object after the increment happened (as if the increment had happened before this statement).

This is simple and only requires that the method return a reference to itself. A reference is an l-value and thus will behave as expected.

Objects Post-increment:

The requirement (expected behavior) here is that the object is incremented (in the same way as pre-increment) and the value returned looks like the old value and is non-mutable (so that it does not behave like an l-value).

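Non-Mutable:
To do this you should return an object by value. If the object is used within an expression it will be copy constructed into a temporary. That temporary is an rvalue, so it cannot bind to a non-const reference and does not behave like an l-value. (Strictly speaking, a class-type temporary is not const: non-const member functions can still be called on it.)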

Looks like the old value:
This is simply achieved by creating a copy of the original (probably using the copy constructor) before making any modifications. The copy should be a deep copy, otherwise any changes to the original will affect the copy, and thus the state will change in relation to the expression using the object.

In the same way as pre-increment:
It is probably best to implement post increment in terms of pre-increment so that you get the same behavior.

class Node // Simple Example
{
public:
     /*
      * Post-Increment:
      * To make the result non-mutable return an object
      */
     Node operator++(int)
     {
         Node result(*this);   // Make a copy
         operator++();         // Define Post increment in terms of Pre-Increment

         return result;        // return the copy (which looks like the original)
     }

     /*
      * Pre-Increment:
      * To make the result an l-value return a reference to this object
      */
     Node& operator++()
     {
         /*
          * Update the state appropriately */
         return *this;
     }
};
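To see the pattern with some actual state, here is a minimal sketch using a hypothetical Counter class. It is not part of the answer above; it just follows the same recipe of post-increment written in terms of pre-increment.

```cpp
#include <cassert>

// Hypothetical class, assumed for illustration.
class Counter {
public:
    explicit Counter(int start = 0) : value_(start) {}

    // Pre-increment: update the state, return a reference to this object (an l-value).
    Counter& operator++() { ++value_; return *this; }

    // Post-increment: copy, increment via pre-increment, return the copy by value.
    Counter operator++(int) {
        Counter old(*this);
        ++(*this);
        return old;
    }

    int value() const { return value_; }

private:
    int value_;
};

// Because ++c returns c itself, a second ++ applies to the same object.
int chained_prefix() {
    Counter c(0);
    ++(++c);
    return c.value();
}

// c++ hands back a snapshot of the old state while c moves on.
int postfix_snapshot() {
    Counter c(5);
    Counter old = c++;
    assert(c.value() == 6);
    return old.value();
}
```

With these definitions chained_prefix() yields 2 and postfix_snapshot() yields 5, mirroring how the built-in operators behave on an int.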

Regarding LValue

In C (and Perl, for instance), neither ++i nor i++ is an LValue.
In C++, i++ is not an LValue but ++i is.

++i is equivalent to i += 1, which is equivalent to i = i + 1.
The result is that we're still dealing with the same object i.
It can be viewed as:
```
int i = 0;
++i = 3;  
// is understood as
i = i + 1;  // i now equals 1
i = 3;
```
i++ on the other hand could be viewed as:
First we use the value of i, then increment the object i.
```
int i = 0;
i++ = 3;  
// would be understood as 
0 = 3  // Wrong!
i = i + 1;
```

(edit: updated after a botched first attempt).

The main difference is that i++ returns the pre-increment value whereas ++i returns the post-increment value. I normally use ++i unless I have a very compelling reason to use i++ - namely, if I really do need the pre-increment value.

IMHO it is good practise to use the '++i' form. While the difference between pre- and post-increment is not really measurable when you compare integers or other PODs, the additional object copy you have to make and return when using 'i++' can represent a significant performance impact if the object is either quite expensive to copy, or incremented frequently.

By the way, avoid using multiple increment operators on the same variable in the same statement. You get into a mess of "where are the sequence points" and undefined order of operations, at least in C. I think some of that was cleaned up in Java and C#.

Maybe this has something to do with the way the post-increment is implemented. Perhaps it's something like this:

Create a copy of the original value in memory
Increment the original variable
Return the copy

Since the copy is neither a variable nor a reference to dynamically allocated memory, it can't be an l-value.
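The three steps above can be sketched as an ordinary function template. The name post_increment and the two helpers are made up for illustration; this is not how a compiler literally implements the built-in operator.

```cpp
#include <cassert>

// Hypothetical free function mirroring the three steps for any copyable,
// pre-incrementable type T.
template <typename T>
T post_increment(T& x) {
    T copy = x;    // 1. create a copy of the original value
    ++x;           // 2. increment the original variable
    return copy;   // 3. return the copy (by value, so the call is an rvalue)
}

int old_value_returned() {
    int i = 7;
    return post_increment(i);   // the value before the increment
}

int value_left_behind() {
    int i = 7;
    post_increment(i);
    return i;                   // the variable itself has moved on
}
```

Because post_increment returns by value, an expression like post_increment(i) = 2 fails to compile for an int, just as i++ = 2 does.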

How does the compiler translate this expression? a++

We know that we want to return the unincremented version of a, the old version of a before the increment. We also want to increment a as a side effect. In other words, we are returning the old version of a, which no longer represents the current state of a, it no longer is the variable itself.

The value returned is a copy of a, which is typically placed in a register. Then the variable is incremented. So you are not returning the variable itself, but a copy that is a separate entity. Recall that an lvalue in C++ is an expression that refers to an object with an identifiable location in memory. The copy, by contrast, need not have one: it may live only in a CPU register. That is why the copy of the old version of a is an rvalue. In general, copies, temporary values, and the results of longer expressions like (5 + a) * b are typically held in registers and then assigned to a variable, which is an lvalue.

The postfix operator must store the original value into a register so that it can return the unincremented value as its result. Consider the following code:

for (int i = 0; i != 5; i++) {...}

This for-loop counts up to five, but i++ is the most interesting part. It is actually two instructions in one: first the old value of i is moved into a register, then i is incremented. In pseudo-assembly code:

mov i, eax
inc i

The eax register now contains a copy of the old value of i. If the variable i resides in main memory, it may take the CPU a while to fetch the copy from main memory into the register. That is usually very fast on modern systems, but if your for-loop iterates a hundred thousand times, all those extra operations start to add up, and could become a significant performance penalty.

Modern compilers are usually smart enough to optimize away this extra work for integer and pointer types. For more complicated iterator types, or maybe class types, this extra work potentially might be more costly.
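To get a feel for the extra work on a class type, here is a rough sketch that counts copy-constructor calls. HeavyCursor and its members are hypothetical stand-ins for an expensive-to-copy iterator. The exact number of copies for the postfix form depends on whether the compiler elides the copy on return, so the sketch only relies on a lower bound.

```cpp
#include <cassert>
#include <vector>

// Hypothetical "expensive" cursor type that counts how often it is copied.
struct HeavyCursor {
    static int copies;
    std::vector<int> payload;   // stands in for costly state
    int pos;

    HeavyCursor() : payload(1024, 0), pos(0) {}
    HeavyCursor(const HeavyCursor& o) : payload(o.payload), pos(o.pos) { ++copies; }

    HeavyCursor& operator++() { ++pos; return *this; }                        // no copy
    HeavyCursor operator++(int) { HeavyCursor old(*this); ++pos; return old; } // at least one copy
};
int HeavyCursor::copies = 0;

int copies_made_by_prefix(int n) {
    HeavyCursor c;
    HeavyCursor::copies = 0;
    for (int k = 0; k < n; ++k) ++c;
    return HeavyCursor::copies;
}

int copies_made_by_postfix(int n) {
    HeavyCursor c;
    HeavyCursor::copies = 0;
    for (int k = 0; k < n; ++k) c++;   // result discarded, but the copy is still made
    return HeavyCursor::copies;
}
```

Because the copy constructor has an observable side effect here, the compiler cannot drop the copy made inside operator++(int), even when the result is thrown away.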

What about the prefix increment ++a?

We want to return the incremented version of a, the new version of a after the increment. The new version of a represents the current state of a, because it is the variable itself.

First a is incremented. Since we want to get the updated version of a, why not just return the variable a itself? We do not need to make a temporary copy into the register to generate an rvalue. That would require unnecessary extra work. So we just return the variable itself as an lvalue.

If we don't need the unincremented value, there's no need for the extra work of copying the old version of a into a register, which is done by the postfix operator. That is why you should only use a++ if you really need to return the unincremented value. For all other purposes, just use ++a. By habitually using the prefix versions, we do not have to worry about whether the performance difference matters.

Another advantage of using ++a is that it expresses the intent of the program more directly: I just want to increment a! However, when I see a++ in someone else's code, I wonder why do they want to return the old value? What is it for?

C#:

public void test(int n)
{
  Console.WriteLine(n++);
  Console.WriteLine(++n);
}

/* Output:
n
n+2
*/

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow