Question

I am working on an older MFC/C++ project that parses large text files and uses MFC's CString class to handle strings. I noticed that during parsing, many small pieces are appended to one large CString object, like this:

//'strContainer' = CString
//'tag' = CString of a much smaller size
strContainer += L"<" + tag + L">";

The statement above seems to slow down noticeably once strContainer grows large. I suppose this happens because the += operator frequently reallocates memory.

So I was curious, is there any way to improve this?

PS1. I do not know the size of the result string up front to pre-allocate it.

PS2. I have to stick with CString due to the complexity of the project itself. (That is, I can't switch to Boost or other newer implementations.)

Solution

With std::string, += is usually quite fast because it can just copy characters into an already allocated buffer. The expression L"<" + tag + L">", however, will usually require three or more memory allocations for temporaries, all of which are unnecessary if you simply replace that line of code with three += statements. Additionally, allocations are REALLY slow if you have Visual Studio start the program for you, even for release builds, most likely because a process launched under the debugger uses the Windows debug heap. Run your program manually, outside Visual Studio, and see if that solves your performance problem.
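As a minimal sketch of the "three +=" replacement: the helper name AppendTag is hypothetical, and it is written as a template only so that it compiles without MFC. The assumption is that the string type accepts += with a wide literal and with another string of the same type, which holds for std::wstring and should hold for a Unicode-build CString.

```cpp
#include <string>

// Hypothetical helper: appends "<tag>" to `container` with three in-place
// appends, so no temporary string is built for the right-hand side.
// StringT is assumed to support operator+= for a wide string literal and
// for another StringT (std::wstring here; CString in the MFC project).
template <typename StringT>
void AppendTag(StringT& container, const StringT& tag)
{
    container += L"<";   // append the opening bracket in place
    container += tag;    // append the tag text in place
    container += L">";   // append the closing bracket in place
}
```

In the project, the original line would then become AppendTag(strContainer, tag), or simply the three += statements written inline.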

I dug into the MFC source to verify this (and dug, and dug, and dug...) and found that ATL::CSimpleStringT::PrepareWrite2(int nLength) grows the buffer exponentially, making it 1.5x bigger on each allocation. That is completely normal, and std::string grows exponentially too, with one exception:
if the MFC string is over 1G, it only adds 1M on each allocation after that.

So there are two cases. If strContainer is over 1G, you should reserve memory manually: preallocate a large buffer, for example with CString::Preallocate. The size doesn't have to be exact, or even as large as the final size.
Otherwise, simply replace the + with +=.
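The growth rule described above, and the effect of reserving memory up front, can be modeled in portable C++. This is only a sketch of the rule as stated in this answer, not the ATL code itself: the names NextCapacity and CountReallocations are hypothetical, sizes are counted in characters, and 64-bit integers are used for the model even though the ATL function takes an int.

```cpp
#include <cstdint>

// Model of the growth rule: when `needed` exceeds `current`, grow by 1.5x
// up to 1G, then by 1M per step; never return less than `needed`.
inline std::int64_t NextCapacity(std::int64_t current, std::int64_t needed)
{
    const std::int64_t kOneG = 1024LL * 1024 * 1024;
    const std::int64_t kOneM = 1024LL * 1024;
    if (current >= needed) {
        return current;  // buffer is already large enough, no reallocation
    }
    std::int64_t grown = (current > kOneG) ? current + kOneM
                                           : current + current / 2;
    return grown < needed ? needed : grown;
}

// Counts how many reallocations the model performs while appending
// `chunk` characters at a time until `finalLength` is reached.
inline int CountReallocations(std::int64_t startCapacity,
                              std::int64_t finalLength,
                              std::int64_t chunk)
{
    int count = 0;
    std::int64_t capacity = startCapacity;
    for (std::int64_t length = 0; length < finalLength; ) {
        length += chunk;
        std::int64_t next = NextCapacity(capacity, length);
        if (next != capacity) {
            ++count;
            capacity = next;
        }
    }
    return count;
}
```

In this model, CountReallocations(0, 1000, 10) gives 12, while CountReallocations(500, 1000, 10) gives 2, which illustrates why even an underestimated preallocation helps.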

Licensed under: CC-BY-SA with attribution