Question

This is really a question just for my own interest; I haven't been able to answer it from the documentation.

I see on http://www.cplusplus.com/reference/string/string/ that append has complexity:

"Unspecified, but generally up to linear in the new string length."

while push_back() has complexity:

"Unspecified; Generally amortized constant, but up to linear in the new string length."

As a toy example, suppose I wanted to append the characters "foo" to a string. Would

myString.push_back('f');
myString.push_back('o');
myString.push_back('o');

and

myString.append("foo");

amount to exactly the same thing? Or is there any difference? You might figure that append would be more efficient because the implementation would know how much memory is required to extend the string by the specified number of characters, while push_back may need to secure memory on each call?


Solution

In C++03 (for which most of cplusplus.com's documentation is written), the complexities were unspecified because library implementers were allowed to use Copy-On-Write (COW) or "rope-style" internal representations for strings. For instance, a COW implementation might have to copy the entire string when a character is modified while the buffer is shared.

In C++11, COW and rope implementations are banned. For appending at the end of a string, you should expect amortized constant time per character added, or equivalently amortized linear time in the number of characters added. Implementers may still do relatively crazy things with strings (in comparison to, say, std::vector), but most implementations are going to be limited to things like the "small string optimization".
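To make "amortized constant" a bit more concrete, here is a minimal sketch that counts how often capacity() changes while characters are pushed one at a time. The function name count_reallocations is made up for this illustration, and the growth policy is implementation-defined, so the exact count will differ between standard libraries.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: appends n characters one at a time and counts how
// often capacity() changes, i.e. how often the string grew its buffer.
std::size_t count_reallocations(std::size_t n) {
    std::string s;
    std::size_t reallocations = 0;
    std::size_t last_capacity = s.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        s.push_back('x');
        if (s.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = s.capacity();
        }
    }
    return reallocations;
}
```

On implementations that grow the buffer geometrically (the usual way to meet the amortized bound), a million push_back calls should trigger only a few dozen capacity changes, not one per call.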

In comparing push_back and append, push_back deprives the underlying implementation of potentially useful length information which it might use to preallocate space. On the other hand, append with a C string requires the implementation to walk over the input twice: once to find its length and once to copy it. So the performance gain or loss is going to depend on a number of factors you can't know in advance, such as the length (and capacity) of the string before you attempt the append. That said, the difference is probably extremely Extremely EXTREMELY small. Go with append for this -- it is far more readable.
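As a rough sketch of the trade-off just described, the three helpers below all produce the same string but give the implementation different amounts of length information. The helper names are invented for this example; push_back, append(const char*) and append(const char*, size_t) are the standard std::string members.

```cpp
#include <cstddef>
#include <string>

// No length information: the string learns about one character per call.
std::string with_push_back(std::string base, const char* suffix) {
    for (const char* p = suffix; *p != '\0'; ++p) {
        base.push_back(*p);
    }
    return base;
}

// append(const char*) has to find the terminating '\0' first, then copy.
std::string with_append(std::string base, const char* suffix) {
    base.append(suffix);
    return base;
}

// If the caller already knows the length, append(ptr, n) skips the scan.
std::string with_append_n(std::string base, const char* suffix, std::size_t n) {
    base.append(suffix, n);
    return base;
}
```

For a three-character literal like "foo" none of this is likely to be measurable, which is why readability should decide.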

OTHER TIPS

I had the same doubt, so I wrote a small test to check this (g++ 4.8.5 in C++11 mode on 64-bit Intel Linux, running under VMware Fusion).

And the result is interesting:

push :19
append :21
++++ :34

This may be because of the (large) string length, but operator += is very expensive compared with push_back and append.

It is also interesting that when the operator receives only a character (not a string), it behaves very similarly to push_back.

So that no test depends on memory allocated by a previous one, each loop is defined in its own scope.

Note: vCounter simply uses gettimeofday to measure the elapsed time.

TimeCounter vCounter;

{
    string vTest;

    vCounter.start();
    for (int vIdx=0;vIdx<1000000;vIdx++) {
        vTest.push_back('a');
        vTest.push_back('b');
        vTest.push_back('c');
    }
    vCounter.stop();
    cout << "push :" << vCounter.elapsed() << endl;
}

{
    string vTest;

    vCounter.start();
    for (int vIdx=0;vIdx<1000000;vIdx++) {
        vTest.append("abc");
    }
    vCounter.stop();
    cout << "append :" << vCounter.elapsed() << endl;
}

{
    string vTest;

    vCounter.start();
    for (int vIdx=0;vIdx<1000000;vIdx++) {
        vTest += 'a';
        vTest += 'b';
        vTest += 'c';
    }
    vCounter.stop();
    cout << "++++ :" << vCounter.elapsed() << endl;
}
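Since TimeCounter is not shown above, here is a self-contained sketch of the same measurement using std::chrono::steady_clock instead of gettimeofday. The names build_push_back, build_append, build_plus_equals, time_ms and run_benchmark are my own, not from the original test, and the timings will vary with compiler, optimization level and standard library.

```cpp
#include <chrono>
#include <cstddef>
#include <iostream>
#include <string>

std::string build_push_back(std::size_t n) {
    std::string s;
    for (std::size_t i = 0; i < n; ++i) {
        s.push_back('a');
        s.push_back('b');
        s.push_back('c');
    }
    return s;
}

std::string build_append(std::size_t n) {
    std::string s;
    for (std::size_t i = 0; i < n; ++i) {
        s.append("abc");
    }
    return s;
}

std::string build_plus_equals(std::size_t n) {
    std::string s;
    for (std::size_t i = 0; i < n; ++i) {
        s += 'a';
        s += 'b';
        s += 'c';
    }
    return s;
}

// Times one builder and reports the result size, so the work is observable
// and cannot simply be optimized away.
template <typename Builder>
long long time_ms(Builder build, std::size_t n, std::size_t& out_size) {
    const auto start = std::chrono::steady_clock::now();
    const std::string result = build(n);
    const auto stop = std::chrono::steady_clock::now();
    out_size = result.size();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}

void run_benchmark(std::size_t n) {
    std::size_t size = 0;
    std::cout << "push   :" << time_ms(build_push_back, n, size) << " ms\n";
    std::cout << "append :" << time_ms(build_append, n, size) << " ms\n";
    std::cout << "+=     :" << time_ms(build_plus_equals, n, size) << " ms\n";
    std::cout << "final size: " << size << '\n';
}
```

Calling run_benchmark(1000000) from main reproduces the loop counts used above. A single run like this is noisy, so repeat it a few times and build with optimizations enabled before reading anything into the numbers.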

One more opinion to add here.

I personally find push_back() the better choice when adding characters one by one from another string. For instance:

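// Needs <cctype> for isalpha and <string>; assumes "using namespace std;".
string FilterAlpha(const string& s) {
  string new_s;
  for (char it : s) {
    // Cast first: passing a negative char value to isalpha is undefined behavior.
    if (isalpha(static_cast<unsigned char>(it))) new_s.push_back(it);
  }
  return new_s;
}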

If I used append() here, I would have to replace push_back(it) with append(1, it), which is not as readable to me.
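For comparison, here is a small sketch of the same filter written three ways. The function names and the is_alpha_char wrapper are made up for this example. The std::copy_if version goes through std::back_inserter, which calls push_back on the string for every character that passes the test.

```cpp
#include <algorithm>
#include <cctype>
#include <iterator>
#include <string>

// Wrapper so that isalpha never sees a negative char value.
bool is_alpha_char(char c) {
    return std::isalpha(static_cast<unsigned char>(c)) != 0;
}

std::string filter_push_back(const std::string& s) {
    std::string out;
    for (char c : s) {
        if (is_alpha_char(c)) out.push_back(c);
    }
    return out;
}

std::string filter_append(const std::string& s) {
    std::string out;
    for (char c : s) {
        if (is_alpha_char(c)) out.append(1, c);  // count first, then the character
    }
    return out;
}

std::string filter_copy_if(const std::string& s) {
    std::string out;
    std::copy_if(s.begin(), s.end(), std::back_inserter(out), is_alpha_char);
    return out;
}
```

All three return the same string; which one reads best is mostly a matter of taste.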

Yes, I would also expect append() to perform better for the reasons you gave, and in a situation where you need to append a string, using append() (or operator+=) is certainly preferable (not least also because the code is much more readable).

But what the Standard specifies is the complexity of the operation. And that is generally linear even for append(), because ultimately each character of the string being appended (and possibly all characters of the existing string, if reallocation occurs) needs to be copied. This is true even if memcpy or something similar is used.

Licensed under: CC-BY-SA with attribution