Question

When should I use std::string and when should I use char* to manage arrays of chars in C++?

It seems you should use char* if performance (speed) is crucial and you're willing to accept some risk because of the manual memory management.

Are there other scenarios to consider?


Solution

You can pass a std::string by reference (or pass a pointer to the instance) to avoid copying it when it is large, so I don't see any real advantage in using char pointers.
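As a minimal sketch of passing by reference, the function below takes its argument as const std::string&, so the caller's string is not copied on the call. The name count_spaces is hypothetical and only serves as an illustration.

```cpp
#include <cstddef>
#include <string>

// Takes the string by const reference: no copy is made on the call,
// regardless of how large the string is.
std::size_t count_spaces(const std::string& text)
{
    std::size_t n = 0;
    for (char c : text) {
        if (c == ' ') {
            ++n;
        }
    }
    return n;
}
```

Declaring the parameter as plain std::string instead would compile just as well, but every call with an existing string would then copy it.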

I use std::string/wstring for more or less everything that is actual text. char * is useful for other types of data, though, as long as you can be sure it gets deallocated like it should. Otherwise std::vector is the way to go.

There are probably exceptions to all of this.

OTHER TIPS

My point of view is:

  • Never use char * if you don't call "C" code.
  • Always use std::string: It's easier, it's more friendly, it's optimized, it's standard, it will prevent you from having bugs, it's been checked and proven to work.
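When C code does have to be called, a std::string can still be the owner of the text. The sketch below assumes a C function that expects a null-terminated const char* (std::strlen stands in for it here); the wrapper name length_via_c is hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hands a std::string to a C-style function. c_str() returns a
// null-terminated const char* that stays valid until the string is
// modified or destroyed.
std::size_t length_via_c(const std::string& s)
{
    return std::strlen(s.c_str());
}
```

The char* only exists for the duration of the call, so no manual memory management is involved.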

Raw string usage

Yes, sometimes raw strings really are a reasonable choice. With const char *, char arrays allocated on the stack, and string literals, you can write code in such a way that there is no memory allocation at all.

Writing such code often requires more thought and care than using string or vector, but with proper techniques it can be done and the code can be safe. When copying into a char [] you always need to make sure that you either have some guarantee on the length of the string being copied, or that you check for oversized strings and handle them gracefully. Not doing so is what gave the strcpy family of functions the reputation of being unsafe.
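One way to do the "check and handle oversized strings" part is sketched below. The function name copy_checked and its policy (refuse the copy and leave an empty string) are assumptions made for illustration; truncating instead would be an equally valid policy.

```cpp
#include <cstddef>
#include <cstring>

// Copies src into dst only if it fits, including the terminating '\0'.
// Returns false and leaves dst as an empty string when src is too long.
bool copy_checked(char* dst, std::size_t dst_size, const char* src)
{
    if (dst_size == 0) {
        return false;               // no room even for the terminator
    }
    const std::size_t len = std::strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';              // handle the oversized input gracefully
        return false;
    }
    std::memcpy(dst, src, len + 1); // len + 1 also copies the '\0'
    return true;
}
```

A caller would typically write copy_checked(buf, sizeof buf, text) for a stack buffer buf and test the return value.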

How templates can help writing safe char buffers

As for the safety of char [] buffers, templates can help, as they can encapsulate the handling of the buffer size for you. Templates like this are implemented by Microsoft, for example, to provide safe replacements for strcpy. The example here is extracted from my own code; the real code has a lot more methods, but this should be enough to convey the basic idea:

#include <cstring> // strncpy

template <int Size>
class BString
{
  char _data[Size];

  public:
  BString()
  {
    _data[0]=0;
    // note: last character will always stay zero
    // if not, overflow occurred
    // all constructors should contain last element initialization
    // so that it can be verified during destruction
    _data[Size-1]=0;
  }
  const BString &operator = (const char *src)
  {
    strncpy(_data,src,Size-1);
    return *this;
  }

  operator const char *() const {return _data;}
};

//! overloads that make conversion of C code easier 
template <int Size>
inline const BString<Size> & strcpy(BString<Size> &dst, const char *src)
{
  return dst = src;
}

One occasion where you MUST use char* and not std::string is when you need static string constants. The reason is that you have no control over the order in which modules initialize their static variables, so another global object from a different module may refer to your string before it has been initialized. See http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Static_and_Global_Variables
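A short sketch of the difference, with hypothetical names (kAppName, app_name). The char array is initialized from a literal without running any constructor, while a namespace-scope std::string is constructed at run time. The function-local static shown at the end is a commonly used workaround, not something the answer above prescribes.

```cpp
#include <string>

// Initialized from a literal; no constructor runs at startup, so the
// data is already valid when any static initializer executes.
const char kAppName[] = "example-app";

// A namespace-scope std::string, by contrast, is constructed at run time,
// in an order that is unspecified relative to other translation units:
// const std::string kAppNameString = "example-app";

// Workaround when a std::string is needed: a function-local static,
// which is constructed the first time the function is called.
const std::string& app_name()
{
    static const std::string name = kAppName;
    return name;
}
```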

std::string pros:

  • manages the memory for you (the string can grow, and the implementation will allocate a larger buffer for you)
  • higher-level programming interface, works nicely with the rest of the STL.

std::string cons:

  • two distinct STL string instances cannot share the same underlying buffer, so if you pass by value you always get a new copy.
  • there is some performance penalty, but I'd say unless your requirements are special it's negligible.

You should consider using char* in the following cases:

  • The array will be passed as a parameter.
  • You know in advance the maximum size of your array (you know it OR you impose it).
  • You will not do any transformation on this array.

In practice, in C++, char* is often used for small fixed strings such as options, file names, etc.

When to use a C++ std::string:

  • Strings are, overall, more secure than char*. Normally when you work with char* you have to check things yourself to make sure they are right; in the string class all this is done for you.
  • Usually when using char*, you will have to free the memory you allocated. You don't have to do that with string, since it will free its internal buffer when destructed.
  • Strings work well with C++ stringstream, so formatted I/O is very easy.
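The stringstream point can be illustrated with a small sketch. The function name format_setting and the "name=value" format are hypothetical.

```cpp
#include <sstream>
#include <string>

// Builds "name=value" using stream insertion; the stream handles the
// integer-to-text conversion and all buffer growth.
std::string format_setting(const std::string& name, int value)
{
    std::ostringstream out;
    out << name << '=' << value;
    return out.str();
}
```

The equivalent with char* would need a buffer of a suitable size and a call such as snprintf, with the size check left to the caller.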

When to use char*

  • Using char* gives you more control over what is happening "behind" the scenes, which means you can tune the performance if you need to.

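Use (const) char* as parameters if you are writing a library. std::string implementations differ between compilers, so passing a std::string across a library boundary can break when the library and its caller are built with different compilers or settings.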

If you want to use C libraries, you'll have to deal with C-strings. Same applies if you want to expose your API to C.
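A sketch of what such a boundary might look like, assuming a hypothetical function greet exported with C linkage. Only C types cross the boundary, while the implementation is free to use std::string internally.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Internal C++ implementation, free to use std::string.
static std::string make_greeting(const std::string& name)
{
    return "Hello, " + name;
}

// Hypothetical C-callable entry point. Writes at most out_size bytes
// (including the terminator) into out and returns the full length of the
// greeting, so the caller can detect truncation.
extern "C" std::size_t greet(const char* name, char* out, std::size_t out_size)
{
    const std::string result = make_greeting(name);
    if (out != nullptr && out_size > 0) {
        const std::size_t n =
            result.size() < out_size - 1 ? result.size() : out_size - 1;
        std::memcpy(out, result.data(), n);
        out[n] = '\0';
    }
    return result.size();
}
```

A production version would also need to keep C++ exceptions from escaping through the C interface; that is left out here for brevity.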

You can expect most operations on a std::string (such as find) to be as optimized as possible, so they're likely to perform at least as well as a pure C counterpart.

It's also worth noting that std::string iterators quite often map to pointers into the underlying char array. So any algorithm you devise on top of iterators is essentially identical to the same algorithm on top of char * in terms of performance.
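To make the iterator point concrete, here is a sketch of one algorithm written against iterators and usable unchanged with both std::string iterators and raw char pointers. The name count_vowels is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Works on any pair of iterators over char, including raw pointers.
template <typename Iter>
std::ptrdiff_t count_vowels(Iter first, Iter last)
{
    return std::count_if(first, last, [](char c) {
        return c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u';
    });
}
```

It can be called as count_vowels(s.begin(), s.end()) for a std::string s, or as count_vowels(p, p + len) for a const char* p of known length.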

One thing to watch out for is operator[]: most STL implementations do not perform bounds checking there, and should translate it to the same operation on the underlying character array. AFAIK STLPort can optionally perform bounds checking, at which point this operator would be a little bit slower.
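When a checked access is wanted, the standard at() member does the bounds check that operator[] skips and throws std::out_of_range on a bad index. The helper below (char_or, a hypothetical name) is a small sketch of that.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns the character at index i, or fallback when i is out of range.
// at() checks the index and throws std::out_of_range; operator[] does not.
char char_or(const std::string& s, std::size_t i, char fallback)
{
    try {
        return s.at(i);
    } catch (const std::out_of_range&) {
        return fallback;
    }
}
```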

So what does using std::string gain you? It absolves you from manual memory management; resizing the array becomes easier, and you generally have to think less about freeing memory.

If you're worried about performance when resizing a string, there's a reserve function that you may find useful.
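A minimal sketch of using reserve, assuming the final size is known up front. The function name repeat is hypothetical.

```cpp
#include <cstddef>
#include <string>

// Builds a string of `count` copies of `piece`, reserving the final size
// up front so the appends do not need to reallocate along the way.
std::string repeat(const std::string& piece, std::size_t count)
{
    std::string result;
    result.reserve(piece.size() * count);
    for (std::size_t i = 0; i < count; ++i) {
        result += piece;
    }
    return result;
}
```

Note that reserve changes only the capacity; the size of the string still grows with each append.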

If you are using the array of chars as text, use std::string: it is more flexible and easier to use. If you use it for something else, such as data storage, use arrays (prefer vectors).

Even when performance is crucial you had better use vector<char>: it allows memory to be allocated in advance (the reserve() method) and will help you avoid memory leaks. Using vector::operator[] can add some overhead (for example in checked or debug builds), but you can always take the address of the buffer and index it exactly as if it were a char*.
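A sketch of treating a vector<char> as a raw buffer. The function name make_pattern is hypothetical. One assumption worth stating: the sketch sizes the vector in its constructor rather than calling reserve(), because reserve() only sets the capacity and the elements must exist before they are written through a pointer.

```cpp
#include <cstddef>
#include <vector>

// Fills a buffer of n bytes through a raw pointer into the vector's storage.
std::vector<char> make_pattern(std::size_t n)
{
    std::vector<char> buffer(n);  // one allocation, freed automatically
    char* raw = buffer.data();    // contiguous storage, usable like a char*
    for (std::size_t i = 0; i < n; ++i) {
        raw[i] = static_cast<char>('a' + i % 26);
    }
    return buffer;
}
```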

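AFAIK some older std::string implementations used copy-on-write, reference-counted semantics internally to avoid copy overhead even when strings are not passed by reference. Don't rely on it, though: the C++11 requirements on std::string effectively rule out copy-on-write, and current implementations typically use a small-string optimization instead.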

Licensed under: CC-BY-SA with attribution