Question

Sometimes it's wise to split complicated or long expressions into multiple steps. For example (the second version isn't any clearer here, but it's just an example):

return object1(object2(object3(x)));

can be written as:

object3 a(x);
object2 b(a);
object1 c(b);
return c;

Assuming all three classes implement constructors that take an rvalue reference as a parameter, the first version might be faster, because temporary objects are passed and can be moved from. I assume that in the second version the local variables are treated as lvalues. But if the variables aren't used afterwards, do C++11 compilers optimize the code so that the variables are treated as rvalues and both versions behave exactly the same? I'm mostly interested in Visual Studio 2013's C++ compiler, but I'd also be happy to know how GCC behaves in this matter.

Thanks, Michal


Solution

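The compiler cannot silently treat a named local variable as an rvalue here, because that would break the "as-if" rule: picking the move constructor instead of the copy constructor is an observable change. You can, however, use std::move to achieve the desired effect: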

object3 a(x);
object2 b(std::move(a));
object1 c(std::move(b));
return c;
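To make the difference between the three forms visible, here is a minimal sketch. The definitions of object1, object2 and object3 are hypothetical stand-ins (the question does not show the real classes), and the counters g_count and the helper measure() are introduced only for illustration. Each converting constructor records whether the copying or the moving overload was selected.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-ins for the question's classes; the real ones are not shown.
struct counters { int copies = 0; int moves = 0; };
static counters g_count;  // illustration only: records which overload ran

struct object3 {
    std::string data;
    explicit object3(std::string s) : data(std::move(s)) {}
};

struct object2 {
    std::string data;
    explicit object2(const object3& o) : data(o.data) { ++g_count.copies; }
    explicit object2(object3&& o) : data(std::move(o.data)) { ++g_count.moves; }
};

struct object1 {
    std::string data;
    explicit object1(const object2& o) : data(o.data) { ++g_count.copies; }
    explicit object1(object2&& o) : data(std::move(o.data)) { ++g_count.moves; }
};

// Version 1: nested temporaries, so every argument is an rvalue.
object1 nested(const std::string& x) {
    return object1(object2(object3(x)));
}

// Version 2: named locals are lvalues, so the const& overloads are chosen.
object1 named(const std::string& x) {
    object3 a(x);
    object2 b(a);
    object1 c(b);
    return c;
}

// Version 3: named locals with explicit std::move, as in the answer.
object1 named_moved(const std::string& x) {
    object3 a(x);
    object2 b(std::move(a));
    object1 c(std::move(b));
    return c;
}

// Resets the counters, runs one version, and reports which overloads were used.
counters measure(object1 (*f)(const std::string&)) {
    g_count = counters{};
    object1 result = f("payload");
    assert(result.data == "payload");  // the payload survives in every version
    return g_count;
}
```

Calling measure(nested), measure(named) and measure(named_moved) and comparing the copies and moves fields shows which constructor overloads each form selects. The counters track only the conversions between the three types, so the result does not depend on whether the final return is elided.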

Other tips

As juanchopanza said, the compiler cannot (at the C++ level) violate the "as-if" rule; that is, every transformation must produce semantically equivalent code.

However, beyond the C++ level, when the code is optimized, further opportunities may arise.

As such, it really depends on the objects themselves: if the move constructors or destructors have side effects (and (de)allocating memory counts as a side effect), then the optimization cannot occur. If you use only PODs, with default move constructors and destructors, then it will probably be optimized automatically.

"But if the variables aren't later used, do C++11 compilers optimize the code so the variables are considered to be rvalues and both versions work exactly the same?"

It is possible but it greatly depends on your types. Consider the following example with a POD type point:

#include <cstdio>

struct point {
  int x;
  int y;
};

static point translate(point p, int dx, int dy) {
  return { p.x + dx, p.y + dy };
}

static point mirror(point p) {
  return { -p.x, -p.y };
}

static point make_point(int x, int y) {
  return { x, y };
}

int main() {
  point a = make_point(1, 2);
  point b = translate(a, 3, 3);
  point c = mirror(b);

  std::printf("(x,y) = (%d,%d)\n", c.x, c.y);
}

I looked at the assembly code; here is what the whole program(!) was basically compiled into (the code below is a C approximation of the generated assembly):

int main() {
  std::printf("(x,y) = (-4,-5)\n");
}

It not only got rid of all the local variables, it also did the computations at compile time! I have tried both gcc and clang but not msvc.
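The folded values can also be checked without reading assembly. The sketch below assumes the three functions are marked constexpr, which is a change from the code above, and adds two hypothetical helpers, folded() and via_locals(), purely for illustration. It shows that the language itself can evaluate the chain to (-4,-5), and that the named-locals form gives the same result at run time.

```cpp
#include <cassert>

struct point {
    int x;
    int y;
};

// Same functions as above, marked constexpr (an assumption of this sketch).
constexpr point translate(point p, int dx, int dy) {
    return { p.x + dx, p.y + dy };
}

constexpr point mirror(point p) {
    return { -p.x, -p.y };
}

constexpr point make_point(int x, int y) {
    return { x, y };
}

// Hypothetical helper: the whole chain as one constant expression.
constexpr point folded() {
    return mirror(translate(make_point(1, 2), 3, 3));
}

// Evaluated by the compiler: (1,2) -> (4,5) -> (-4,-5).
static_assert(folded().x == -4 && folded().y == -5,
              "the chain folds to (-4,-5)");

// Hypothetical helper: the same steps with named locals, as in main() above.
inline point via_locals(int x, int y) {
    point a = make_point(x, y);
    point b = translate(a, 3, 3);
    point c = mirror(b);
    return c;
}

// Runtime check that the named-locals form agrees with the folded constant.
inline void check_named_locals() {
    point c = via_locals(1, 2);
    assert(c.x == folded().x && c.y == folded().y);
}
```

This only confirms the arithmetic; whether the optimizer removes the locals in a non-constexpr build still has to be checked in the generated assembly, as described above.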

OK, so let's make the program a little more complicated so that it cannot do the computations:

int main(int argc, char* argv[]) {

  int x = *argv[1]-'0';
  int y = *argv[2]-'0';
  point a = make_point(x,y);
  point b = translate(a, 3, 3);
  point c = mirror(b);

  std::printf("(x,y) = (%d,%d)\n", c.x, c.y);
}

To run this code, you would have to call it like ./a.out 1 2.

This whole program is reduced to this one (assembly rewritten in C) after optimization:

int main(int argc, char* argv[]) {
  int x = *argv[1]-'0';
  int y = *argv[2]-'0';
  std::printf("(x,y) = (%d,%d)\n", -(x+3), -(y+3));
}

So it got rid of a, b and c and of the functions make_point(), translate() and mirror(), and did as much of the computation as possible at compile time.

For the reasons mentioned in Matthieu M.'s answer, don't expect optimizations this good with more complicated types (especially non-PODs).

In my experience, inlining is crucial. Work hard so that your functions can be easily inlined. Use link-time optimization.

Be aware that besides move semantics, which can greatly speed up your code, the compiler also performs (N)RVO, (Named) Return Value Optimization, which can make your code even more efficient. I tested your example, and with g++ 4.8 it appears that your second version could actually be the more efficient one:

object3 a(x);
object2 b(a);
object1 c(b);
return c;

From my experiments it looks like it calls constructors/destructors 8 times (1 ctor + 2 copy ctors + 1 move ctor + 4 dtors), compared to the other version, which calls them 10 times (1 ctor + 4 move ctors + 5 dtors). But as user2079303 commented, move constructors should still outperform copy constructors; also, in this example all calls will be inlined, so there is no function call overhead.

Copy/move elision is actually an exception to the "as-if" rule, which means you may sometimes be surprised that a constructor or destructor is not called even though it has side effects.

http://coliru.stacked-crooked.com/a/1ca7ebec0567e48f

(You can disable (N)RVO with the -fno-elide-constructors option.)

#include <iostream>

template<int S>
struct A {
    A() { std::cout << "A::A" << std::endl; }
    // Converting constructors from other specializations A<S2>. Being templates,
    // they are never the copy/move constructor of A<S> itself.
    template<int S2>
    A(const A<S2>&) { std::cout << "A::A&" << std::endl; }
    template<int S2>
    A(const A<S2>&&) { std::cout << "A::A&&" << std::endl; }
    ~A() { std::cout << "~A::A" << std::endl; }
};

A<0> foo() {
    A<2> a; A<1> b(a); A<0> c(b); return c;   // calls dtor/ctor 8 times
    //return A<0>(A<1>(A<2>()));  // calls dtor/ctor 10 times
}

int main()
{
    A<0> a = foo();
    return 0;
}
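
The effect of elision can also be observed with counters instead of printed output. The following is a minimal sketch, not a statement about any particular compiler: the type Noisy, the global g_stats and the helper observe() are hypothetical names introduced here. How many moves are reported depends on the language standard in use and on whether elision is enabled, so the sketch only relies on the invariant that every constructed object is eventually destroyed.

```cpp
#include <cassert>

// Hypothetical bookkeeping for this sketch.
struct stats { int constructed = 0; int copied = 0; int moved = 0; int destroyed = 0; };
static stats g_stats;

// A type whose special member functions all have a visible side effect.
struct Noisy {
    Noisy() { ++g_stats.constructed; }
    Noisy(const Noisy&) { ++g_stats.copied; }
    Noisy(Noisy&&) noexcept { ++g_stats.moved; }
    ~Noisy() { ++g_stats.destroyed; }
};

// Returns a named local: a candidate for NRVO.
Noisy make_named() {
    Noisy n;
    return n;
}

// Returns a temporary: a candidate for RVO.
Noisy make_temporary() {
    return Noisy();
}

// Runs one factory and reports how many special member functions actually ran.
stats observe(Noisy (*factory)()) {
    g_stats = stats{};
    {
        Noisy result = factory();
        (void)result;
    }  // result is destroyed here
    // Every object that came into existence must have been destroyed.
    assert(g_stats.destroyed ==
           g_stats.constructed + g_stats.copied + g_stats.moved);
    return g_stats;
}
```

Comparing observe(make_named).moved with and without -fno-elide-constructors shows how many move constructor calls, side effects included, the compiler was allowed to skip.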
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow