Question

Please understand that I'm still learning the concepts of C++11's move and rvalue semantics. My question is whether legacy code can get a free lunch of avoiding unnecessary copy by simply using C++11 compilers and STL.

Here is a very simple example. This code builds a simple character frequency table for a given string. For example, "apple" should return {('a', 1), ('e', 1), ('l', 1), ('p', 2)}. As you will see, I'm just using vectors as values.

typedef std::tuple<char, int> Frequency;
typedef std::vector<Frequency> Frequencies;

Frequencies buildFrequenciesTable(std::string w) {
  char table['z' - 'a' + 1] = { 0, };
  std::for_each(w.cbegin(), w.cend(), [&table](char c) {
    ++table[::tolower(c) - 'a'];
  });

  Frequencies freqs;
  for (size_t i = 0; i < 'z' - 'a' + 1; ++i) {
    if (table[i] != 0)
      freqs.push_back(tuple<char, int>((char) ('a' + i), table[i]));
  }
  return freqs; // Q1: Is vector get copied?
}

int main() {
  using namespace std;

  Frequencies f1 = buildFrequenciesTable("apple"); // Q2: Copy?
  Frequencies f2 = buildFrequenciesTable("banana");
  vector<Frequencies> fs = { f1, f2 }; // Q3: Copy?
}

It is clear that C++03 generates all copy code (using copy constructors and assignment operators) when returning vectors as value. What about in C++11? std::vector a has move constructor. Can this code get avoid any unnessarry copies? Or, should I use either && or std::forward in the above code?

I tried to debug the internal STL code, but it was hard to convince.

Note: My goal is to minimize any unnecessary copies in these functions. I know I could use new/pointers/references, but this will need to resolve memory leak problem. So, I'd like to use values as much as possible.

Was it helpful?

Solution

For Q1, there is most likely no copy even in C++03 as the copy is removed by the "Named Return Value Optimization" (NRVO).

For Q2, there is also most likely no copy even in C++03 as copy-elision removes it.

For Q3, even in C++11 you do have copies as you would need to mark f1 and f2 as movable in order to actually have them moved:

vector<Frequencies> fs = { std::move(f1), std::move(f2) };

Since you asked multiple question I think I will omit further explanations, look up NRVO, copy-elision and where std::move is required, ask if you have any further questions.

However, there are cases where you would get free moves, for example if there is a temporary which could be moved:

vector<Frequencies> fs = { buildFrequenciesTable("apple"),
                           buildFrequenciesTable("bananas") };

The above would detect the two vectors returned from buildFrequenciesTable() as temporaries and therefore they will be moved into fs.

OTHER TIPS

Returning the vector from the function (Q1) will use move semantics where possible, without needing to modify the code. Likewise initialising the vectors from the returned temporary (Q2) will use move semantics; the return value is an rvalue, so can be moved from. In practice, both moves (or, historically, copies) should be elided, so that the function initialises f1 and f2 directly with no moving or copying.

Putting them in the vector (Q3) does require copying: variables are lvalues, which can't be implicitly moved. So you would have to use std::move, or restructure the code, to avoid these copies.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top