Question

My class has a method with the following prototype:

std::string
Block::get_field(std::string rec_type, std::string field) { ... }

It retrieves a value from a map, converts it to a string, and returns the string.

Someone used my class as follows:

char * buf = block.get_field(my_rec, my_field).c_str();

Unless I'm mistaken, this is setting buf to point into a temporary string. Not very much later, when he looked at *buf, it was corrupted.

Obviously I'd prefer my users write:

std::string buf = block.get_field(my_rec, my_field);

How do I, the author of Block, prevent abuses like this? Or is there a way I can somehow support this usage?

UPDATE 1: This bug did not come to light until I changed the implementation of the map, from holding values directly, to holding "memory references" - a length and a pointer into a buffer. But get_field has always returned a string -- not a string* or a string& -- so I would assume that it always returned a temporary. I don't understand why it didn't break before (I'm also embarrassed; I claimed my changes would not affect the API).

I'm prepared to inform the user that he has to revise his code, but I'd like to be able to cite "The Rule" that he broke, and possibly explain how he happened to "luck out" before now.

UPDATE 2: It seems possible (in ref. to Update 1), that the reason the bug just now appeared is that my "under the hood" changes required me to add the following switch to g++ 4.4.7: -std=gnu++0x, which may have influenced the way in which temporaries are recycled.

Was it helpful?

Solution

Mostly you have to assume that users of your code are competent.

But this is a political world. So maybe sometimes you may have to do a probabilistic best effort to prevent mishaps. E.g.,

#include <assert.h>
#include <string>

class StdString: public std::string
{
private:
    using std::string::c_str;
    using std::string::data;
public:
    StdString( char const* const s ): std::string( s ) { assert( s != nullptr ); }
};

StdString foo() { return "Bah!"; }

int main()
{
    std::string const   a = foo();
    char const* const   b = foo().c_str();      //! Nyet.
}

Maybe throw in a move constructor there for good measure, but generally perfect efficiency is in conflict with perfect safety and perfect clarity, e.g. can has 2 but not 3.

OTHER TIPS

You really shouldn't have to prevent it, but if you see that happening a lot, you can change the prototype to:

void
Block::get_field(std::string& ret, std::string rec_type, std::string field) { ... }

Taking a reference: std::string& ret will force a std::string to exist before calling your method, while they can still do things to corrupt their memory it will happen in "their code". Something as simple as this bad code is very hard to prevent:

char * foo()
{
    std::string buffStr;
    block.get_field(bufStr, my_rec, my_field)
    return buffStr.c_str();         // DO NOT DO THIS - Example of bad code
}

But at least a mem checker will find their allocation and not yours

UPDATE 1: The lifetime of the memory that c_str() returns is the lifetime of the string that generated it, unless the string is changed, in which case the lifetime may end then. Basically, it is undefined behavior and may work sometimes, and not others. Your changes may have changed the observed behavior, but at no time was the user code conformant. Other factors could have broken it as well. Basically, it was always broken, from your description there seems no reason to apologize.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top