What optimization does move semantics provide if we already have RVO?

https://stackoverflow.com/questions/5031778

14-11-2019

Question

As far as I understand, one of the purposes of adding move semantics is to optimize code by calling a special constructor when copying "temporary" objects. For example, in this answer we see that it can be used to optimize expressions such as string a = x + y. Because x + y is an rvalue expression, instead of making a deep copy we can copy only the pointer to the string's buffer and its size. But as we know, modern compilers support return value optimization (RVO), so even without move semantics our code will not call the copy constructor at all.

To check this, I wrote the following code:

#include <iostream>

struct stuff
{
        int x;
        stuff(int x_):x(x_){}
        stuff(const stuff & g):x(g.x)
        {
                std::cout<<"copy"<<std::endl;
        }
};   
stuff operator+(const stuff& lhs,const stuff& rhs)
{
        stuff g(lhs.x+rhs.x);
        return g;
}
int main()
{
        stuff a(5),b(7);
        stuff c = a+b;
}

After compiling it with optimizations enabled in VC++ 2010 and g++ and running it, I get empty output.

What kind of optimization is move semantics if my code is already this fast without it? Could you explain what I am misunderstanding?

Solution 2

After some digging I found this excellent example of optimization with rvalue references in Stroustrup's FAQ.

Yes, the swap function:

#include <utility>           // std::move

template<class T>
void swap(T& a, T& b)        // "perfect swap" (almost)
{
    T tmp = std::move(a);    // could invalidate a
    a = std::move(b);        // could invalidate b
    b = std::move(tmp);      // could invalidate tmp
}

This generates optimized code for any type that has a move constructor and a move assignment operator; types without them simply fall back to copying.

Edit: RVO also can't optimize something like this (at least on my compiler):
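As a rough check of that claim, the sketch below swaps two objects of a hypothetical type `Counted` that counts its own copies and moves. The names `Counted`, `move_swap` and `demo_move_swap` are made up for this illustration; `move_swap` has the same shape as the swap above and is renamed only to avoid clashing with std::swap.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts how often it is copied and moved.
struct Counted {
    static int copies;
    static int moves;
    int value;

    explicit Counted(int v) : value(v) {}
    Counted(const Counted& o) : value(o.value) { ++copies; }
    Counted(Counted&& o) noexcept : value(o.value) { ++moves; }
    Counted& operator=(const Counted& o) { value = o.value; ++copies; return *this; }
    Counted& operator=(Counted&& o) noexcept { value = o.value; ++moves; return *this; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// Same shape as the swap above, renamed to avoid clashing with std::swap.
template <class T>
void move_swap(T& a, T& b) {
    T tmp = std::move(a);   // move construction
    a = std::move(b);       // move assignment
    b = std::move(tmp);     // move assignment
}

// Swaps two objects and checks that the values were exchanged.
void demo_move_swap() {
    Counted x(1), y(2);
    move_swap(x, y);
    assert(x.value == 2 && y.value == 1);
}
```

Calling `demo_move_swap()` from `main` should leave `Counted::copies` at 0 and `Counted::moves` at 3: one move construction and two move assignments.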

stuff func(const stuff& st)
{
    if(st.x>0)
    {
        stuff ret(2*st.x);
        return ret;
    }
    else
    {
        stuff ret2(-2*st.x);
        return ret2;
    }
}

This function always calls the copy constructor (checked with VC++). If our class can be moved faster than it can be copied, then adding a move constructor gives us an optimization here.
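To illustrate, here is a minimal sketch that assumes a variant of the class with a move constructor added. The names `stuff2`, `func2` and `copies_made_by_func2` are hypothetical. Since C++11, a local variable named in a return statement is treated as an rvalue first, so when the compiler does not elide the copy it picks the move constructor instead of the copy constructor.

```cpp
#include <cassert>

// Hypothetical variant of `stuff` that counts copies and moves.
struct stuff2 {
    static int copies;
    static int moves;
    int x;

    explicit stuff2(int x_) : x(x_) {}
    stuff2(const stuff2& g) : x(g.x) { ++copies; }
    stuff2(stuff2&& g) noexcept : x(g.x) { ++moves; }
};
int stuff2::copies = 0;
int stuff2::moves = 0;

// Two different named locals are returned, which many compilers
// cannot handle with NRVO. The returned local is treated as an
// rvalue, so the move constructor is used when elision fails.
stuff2 func2(const stuff2& st) {
    if (st.x > 0) {
        stuff2 ret(2 * st.x);
        return ret;
    } else {
        stuff2 ret2(-2 * st.x);
        return ret2;
    }
}

// Returns the number of copy constructions made by one call to func2.
int copies_made_by_func2() {
    stuff2::copies = 0;
    stuff2 in(-3);
    stuff2 out = func2(in);
    assert(out.x == 6);   // -2 * -3
    return stuff2::copies;
}
```

The number of moves depends on how much the compiler elides, but the copy count should stay at zero either way.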

OTHER TIPS

Move semantics should not be thought of as an optimization device, even if they can be used as one.

If you are going to want copies of objects (either function parameters or return values), then RVO and copy elision will do the job when they can. Move semantics can help, but are more powerful than that.

Move semantics are handy when you want to do something different depending on whether the passed object is a temporary (which then binds to an rvalue reference) or a "standard" object with a name (a so-called lvalue). If, for instance, you want to steal the resources of a temporary object, then you want move semantics (example: you can steal the object that a std::unique_ptr points to).
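A small sketch of that idea, using a hypothetical `Holder` class (the class and its member names are invented for this example): one overload takes a const lvalue reference and copies, the other takes an rvalue reference and steals the string's buffer.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical holder that treats named objects and temporaries differently.
class Holder {
public:
    // Named object (lvalue): the caller keeps its string, so copy it.
    void set(const std::string& s) { text_ = s; last_was_move_ = false; }
    // Temporary or std::move'd object (rvalue): steal its buffer.
    void set(std::string&& s) { text_ = std::move(s); last_was_move_ = true; }

    const std::string& text() const { return text_; }
    bool last_was_move() const { return last_was_move_; }

private:
    std::string text_;
    bool last_was_move_ = false;
};

// Exercises both overloads.
void demo_holder() {
    Holder h;
    std::string name = "hello";
    h.set(name);                 // lvalue: copy overload, name is untouched
    assert(!h.last_was_move() && name == "hello");
    h.set(name + " world");      // temporary: move overload
    assert(h.last_was_move() && h.text() == "hello world");
}
```

Overload resolution does the dispatch: the caller writes the same `set(...)` call in both cases, and only an explicit std::move opts a named object into the stealing path.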

Move semantics allow you to return non-copyable objects from functions, which was not possible in C++03. Also, non-copyable objects can be put inside other objects, and those objects will automatically be movable if the contained objects are.
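Both points can be sketched with a hypothetical move-only `Handle` type and a `Session` struct that contains one; all names here are invented for illustration. `Session` declares no special member functions, so the compiler generates a move constructor for it and leaves it non-copyable.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical move-only resource handle.
class Handle {
public:
    explicit Handle(int id) : id_(id) {}
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;
    Handle(Handle&& other) noexcept : id_(other.id_) { other.id_ = -1; }
    Handle& operator=(Handle&& other) noexcept {
        id_ = other.id_;
        other.id_ = -1;
        return *this;
    }
    int id() const { return id_; }

private:
    int id_;   // -1 marks a moved-from handle
};

// Returning a non-copyable object by value works because it can be moved.
Handle make_handle(int id) {
    Handle h(id);
    return h;
}

// No special members declared: Session is implicitly movable, not copyable.
struct Session {
    Handle handle;
    int user;
};

static_assert(!std::is_copy_constructible<Session>::value, "Session is not copyable");
static_assert(std::is_move_constructible<Session>::value, "Session is movable");

// Moves a Session and checks that ownership was transferred.
void demo_session() {
    Session s{make_handle(7), 42};
    Session t = std::move(s);
    assert(t.handle.id() == 7 && t.user == 42);
    assert(s.handle.id() == -1);
}
```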

Non-copyable objects are great, since they don't force you to implement an error-prone copy constructor. A lot of the time, copy semantics do not really make sense, but move semantics do (think about it).

This also enables you to use movable std::vector<T> classes even if T is non-copyable. The std::unique_ptr class template is also a great tool when dealing with non-copyable objects (e.g. polymorphic objects).
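For instance, a vector of std::unique_ptr to a polymorphic base can be built inside a function and returned by value. The sketch below assumes a hypothetical `Shape` hierarchy; `make_shapes`, `total_area` and `demo_shapes` are illustrative names.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical polymorphic hierarchy; such objects are usually not copied.
struct Shape {
    virtual ~Shape() = default;
    virtual double area() const = 0;
};
struct Square : Shape {
    explicit Square(double s) : side(s) {}
    double area() const override { return side * side; }
    double side;
};
struct Rect : Shape {
    Rect(double w_, double h_) : w(w_), h(h_) {}
    double area() const override { return w * h; }
    double w, h;
};

// The vector is returned by value; its unique_ptr elements are never copied.
std::vector<std::unique_ptr<Shape>> make_shapes() {
    std::vector<std::unique_ptr<Shape>> shapes;
    shapes.push_back(std::unique_ptr<Shape>(new Square(2.0)));
    shapes.push_back(std::unique_ptr<Shape>(new Rect(2.0, 3.0)));
    return shapes;
}

// Sums the areas through the base-class interface.
double total_area(const std::vector<std::unique_ptr<Shape>>& shapes) {
    double sum = 0.0;
    for (const auto& s : shapes) sum += s->area();
    return sum;
}

// Builds the vector, then moves the whole container to a new owner.
void demo_shapes() {
    auto shapes = make_shapes();
    assert(shapes.size() == 2);
    auto moved = std::move(shapes);
    assert(moved.size() == 2 && shapes.empty());
}
```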

Imagine your stuff was a class with heap-allocated memory, like a string, and that it had the notion of capacity. Give it an operator+= that grows the capacity geometrically. In C++03 this might look like:

#include <iostream>
#include <algorithm>

struct stuff
{
    int size;
    int cap;

    stuff(int size_):size(size_)
    {
        cap = size;
        if (cap > 0)
            std::cout <<"allocating " << cap <<std::endl;
    }
    stuff(const stuff & g):size(g.size), cap(g.cap)
    {
        if (cap > 0)
            std::cout <<"allocating " << cap <<std::endl;
    }
    ~stuff()
    {
        if (cap > 0)
            std::cout << "deallocating " << cap << '\n';
    }

    stuff& operator+=(const stuff& y)
    {
        if (cap < size+y.size)
        {
            if (cap > 0)
                std::cout << "deallocating " << cap << '\n';
            cap = std::max(2*cap, size+y.size);
            std::cout <<"allocating " << cap <<std::endl;
        }
        size += y.size;
        return *this;
    }
};

stuff operator+(const stuff& lhs,const stuff& rhs)
{
    stuff g(lhs.size + rhs.size);
    return g;
}

Also imagine you want to add more than just two stuff objects at a time:

int main()
{
    stuff a(11),b(9),c(7),d(5);
    std::cout << "start addition\n\n";
    stuff e = a+b+c+d;
    std::cout << "\nend addition\n";
}

For me this prints out:

allocating 11
allocating 9
allocating 7
allocating 5
start addition

allocating 20
allocating 27
allocating 32
deallocating 27
deallocating 20

end addition
deallocating 32
deallocating 5
deallocating 7
deallocating 9
deallocating 11

I count 3 allocations and 2 deallocations to compute:

stuff e = a+b+c+d;

Now add move semantics:

    stuff(stuff&& g):size(g.size), cap(g.cap)
    {
        g.cap = 0;
        g.size = 0;
    }

...

stuff operator+(stuff&& lhs, const stuff& rhs)
{
    return std::move(lhs += rhs);
}

Running again I get:

allocating 11
allocating 9
allocating 7
allocating 5
start addition

allocating 20
deallocating 20
allocating 40

end addition
deallocating 40
deallocating 5
deallocating 7
deallocating 9
deallocating 11

I'm now down to 2 allocations and 1 deallocation. That translates to faster code.

There are many places where move semantics help, some of which are mentioned in other answers.

One big one is that when a std::vector reallocates, it moves move-aware objects from the old memory location to the new one rather than copying them and destroying the originals (provided the move constructor is noexcept or the type is not copyable).
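The reallocation behaviour can be observed with a hypothetical element type that counts copies and moves. In this sketch `Tracked` and `force_one_reallocation` are invented names. It relies on the move constructor being declared noexcept, because std::vector otherwise falls back to copying copyable elements to keep its strong exception guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type that counts copies and moves.
struct Tracked {
    static int copies;
    static int moves;
    int id;

    explicit Tracked(int i) : id(i) {}
    Tracked(const Tracked& o) : id(o.id) { ++copies; }
    // noexcept matters: without it the vector copies during reallocation.
    Tracked(Tracked&& o) noexcept : id(o.id) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Fills a vector to capacity, resets the counters, then forces one
// reallocation. Returns how many elements had to be relocated.
std::size_t force_one_reallocation() {
    std::vector<Tracked> v;
    v.reserve(4);
    while (v.size() < v.capacity())
        v.emplace_back(static_cast<int>(v.size()));
    const std::size_t old_size = v.size();
    Tracked::copies = 0;
    Tracked::moves = 0;
    v.emplace_back(-1);   // capacity exceeded: existing elements are relocated
    assert(v.size() == old_size + 1);
    return old_size;
}
```

After the call, `Tracked::copies` should be 0 and `Tracked::moves` should equal the returned element count. Removing `noexcept` from the move constructor is an easy way to watch the vector switch back to copying.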

Additionally, rvalue references enable the concept of movable types; this is a semantic difference and not just an optimization. unique_ptr wasn't possible in C++03, which is why we had the abomination of auto_ptr.

Just because this particular case is already covered by an existing optimization does not mean that other cases don't exist where r-value references are helpful.

Move construction allows optimization even when the temporary is returned from a function which cannot be inlined (perhaps it's a virtual call, or through a function pointer).

Your posted example only takes const lvalue references and so explicitly cannot have move semantics applied to it, as there is not a single rvalue reference in there. How can move semantics make your code faster when you implemented a type without rvalue references?

In addition, your code is already covered by RVO and NRVO. Move semantics apply to far, far more situations than those two do.

This line calls the first constructor.

stuff a(5),b(7);

The plus operator is called with both arguments bound to const lvalue references.

stuff c = a + b;

Inside the operator overload, no copy constructor is called. Again, only the first constructor is called.

stuff g(lhs.x+rhs.x);

The initialization is done with RVO, so no copy is needed: the returned object is constructed directly in 'c'.

stuff c = a+b;

Because c is never used afterwards (nothing is printed via std::cout), the compiler can see that its value is never needed. The whole computation may then be stripped out, resulting in an effectively empty program.

Another good example I can think of. Imagine that you're implementing a matrix library and write an algorithm which takes two matrices and outputs another one:

Matrix MyAlgorithm(Matrix U, Matrix V)
{
    Transform(U); //doesn't matter what this actually does, but it modifies U
    Transform(V);
    return U*V;
}

Note that you can't pass U and V by const reference, because the algorithm modifies them. You could in theory pass them by non-const reference, but this would look gross and leave U and V in some intermediate state (since you call Transform(U)), which may not make any sense to the caller, or may not make any mathematical sense at all, since it is just one of the algorithm's internal transformations. The code looks much cleaner if you pass them by value and use move semantics when you are not going to use U and V after calling this function:

Matrix u, v;
...
// three alternative calls; pick one:
Matrix w = MyAlgorithm(u, v);                       // slow, but preserves u and v
Matrix w = MyAlgorithm(std::move(u), std::move(v)); // fast, but leaves u and v moved-from
Matrix w = MyAlgorithm(u, std::move(v));            // you can even do this if you need one but not the other
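To make the trade-off concrete, here is a minimal sketch with a stand-in `Matrix` that only holds a flat std::vector<double> and counts its copies and moves. Everything in it is an assumption made for the illustration: the counters, the `Transform` body, the element-wise product used in place of a real U*V, and the helper `demo_matrix`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical minimal matrix: only the storage plus copy/move counters.
struct Matrix {
    static int copies;
    static int moves;
    std::vector<double> data;

    explicit Matrix(std::size_t n) : data(n, 1.0) {}
    Matrix(const Matrix& o) : data(o.data) { ++copies; }
    Matrix(Matrix&& o) noexcept : data(std::move(o.data)) { ++moves; }
};
int Matrix::copies = 0;
int Matrix::moves = 0;

// Stand-in for the real transformation: scales every element in place.
void Transform(Matrix& m) {
    for (double& x : m.data) x *= 2.0;
}

// Takes both arguments by value, so the caller decides: copy or move.
// The element-wise product is a placeholder for a real U*V and assumes
// both matrices have the same number of elements.
Matrix MyAlgorithm(Matrix U, Matrix V) {
    Transform(U);
    Transform(V);
    for (std::size_t i = 0; i < U.data.size(); ++i) U.data[i] *= V.data[i];
    return U;   // a by-value parameter is moved out, not copied
}

// Compares the copying call with the moving call.
void demo_matrix() {
    Matrix u(4), v(4);
    Matrix::copies = 0;
    Matrix w1 = MyAlgorithm(u, v);                        // copies u and v
    assert(Matrix::copies == 2 && u.data.size() == 4);
    Matrix::copies = 0;
    Matrix w2 = MyAlgorithm(std::move(u), std::move(v));  // steals their storage
    assert(Matrix::copies == 0 && u.data.empty());
    assert(w1.data[0] == 4.0 && w2.data[0] == 4.0);
}
```

The function signature stays the same in both cases; the choice between the expensive and the cheap call is made entirely at the call site.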

Licensed under: CC-BY-SA with attribution
