Why is RVO disallowed when returning a parameter?

https://stackoverflow.com/questions/9444485

12-11-2019

Question

It's stated in [C++11: 12.8/31]:

This elision of copy/move operations, called copy elision, is permitted [...] :

— in a return statement in a function with a class return type, when the expression is the name of a non-volatile automatic object (other than a function or catch-clause parameter) with the same cv-unqualified type as the function return type, the copy/move operation can be omitted by constructing the automatic object directly into the function’s return value

This implies that

#include <iostream>

using namespace std;

struct X
{
    X() { }
    X(const X& other) { cout << "X(const X& other)" << endl; }
};

X no_rvo(X x) {
    cout << "no_rvo" << endl;
    return x;
}

int main() {
    X x_orig;
    X x_copy = no_rvo(x_orig);

    return 0;
}

will print

X(const X& other)
no_rvo
X(const X& other)

Why is the second copy constructor call required? Can't a compiler simply extend the lifetime of x?

Solution

Imagine no_rvo is defined in a different file than main, so that when compiling main the compiler sees only the declaration

X no_rvo(X x);

and has no idea whether the returned object of type X has any relation to the argument. From what it knows at that point, the implementation of no_rvo could just as well be

X no_rvo(X x) { X other; return other; }

So when it compiles, for example, the line

X const& x = no_rvo(X());

it will do the following when maximally optimizing:

1. Generate the temporary X to be passed to no_rvo as the argument.
2. Call no_rvo and bind its return value to x.
3. Destroy the temporary object it passed to no_rvo.

Now, if the return value of no_rvo were the same object as the one passed to it, destroying the temporary would also destroy the returned object. That would be wrong, because the returned object is bound to a reference, which extends its lifetime beyond that statement. Simply not destroying the argument is no solution either, because that would be wrong if no_rvo is defined as in the alternative implementation shown above. So if a function were allowed to reuse an argument as its return value, situations could arise in which the compiler cannot determine the correct behaviour.

Note that with common implementations the compiler would not be able to optimize that copy away anyway, so it is not a big loss that it is not formally allowed. Also note that the compiler may still optimize the copy away if it can prove that doing so does not change the observable behaviour (the so-called as-if rule).

OTHER TIPS

The usual implementation of RVO is that the calling code passes the address of a memory chunk where the function should construct its result object.

When the function result is directly an automatic variable that is not a formal argument, that local variable can simply be placed in the caller-provided memory chunk, and the return statement then does no copying at all.

For an argument passed by value, the calling machine code has to copy-initialize its actual argument into the formal argument’s location before jumping to the function. For the function to place its result there, it would have to destroy the formal argument object first, which has some tricky special cases (e.g., when that construction directly or indirectly refers to the formal argument object). So, instead of identifying the result location with the formal argument location, an optimization here logically has to use a separate caller-provided memory chunk for the function result.

However, the storage for a function result that is not passed in a register is normally provided by the caller anyway. That is, what one could reasonably call RVO here, a kind of diminished RVO for a return expression that denotes a formal argument, is what happens in any case. And it does not fit the text “by constructing the automatic object directly into the function’s return value”.

Summing up, the data flow requires the caller to pass in a value, which means that it is necessarily the caller, not the function, that initializes a formal argument's storage. Hence, copying back from a formal argument cannot be avoided in general (that weasel term covers the special cases where the compiler can do very special things, in particular for inlined machine code). However, it is the function that initializes any other local automatic object’s storage, and then it’s no problem to do RVO.

Licensed under: CC-BY-SA with attribution
