Avoiding the CRT

https://stackoverflow.com/questions/10724690

10-06-2021

Question

When writing a C++ application, I normally limit myself to C++-specific language features. Mostly this means using the STL instead of the CRT wherever possible.

To me, STL is just so much more fluid and maintainable than using CRT. Consider the following:

std::string str( "Hello" );
if( str == "Hello" ) { ... }

The C-Runtime equivalent would be:

char const* str = "Hello";
if( strcmp( str, "Hello" ) == 0 ) { ... }

Personally, I find the former example much easier to look at. It's just clearer to me what's going on. When I write a first pass of my code, the first thing on my mind is always to write it in the most natural way.

One concern my team has with the former example is the dynamic allocation. If the string is static OR has already been allocated elsewhere, they argue it doesn't make sense to potentially cause fragmentation or have a wasteful allocation here. My argument against this is to write code in the most natural way first, and then go back and change it after getting proof that the code causes a problem.

Another reason I don't like the latter example is that it uses the C library. I typically avoid it at all costs simply because it's not C++: it's less readable, more error-prone, and more of a security risk.

So my question is: am I right to avoid the C Runtime? Should I really care about the extra allocation at this stage of coding? It's hard for me to tell if I'm right or wrong in this scenario.

Solution

I feel like my comment about llvm::StringRef went ignored, so I'll make an answer out of it.

llvm::StringRef str("Hello");

This essentially sets a pointer, calls strlen, then sets another pointer. No allocation.

if (str == "Hello") { do_something(); }

Readable, and still no allocation. It also works with std::string.

std::string str("Hello");
llvm::StringRef stref(str);

You have to be careful with that though, because if the string is destroyed or re-allocated, the StringRef becomes invalid.

if (str == stref) { do_something(); }

I have noticed quite substantial performance benefits when using this class in appropriate places. It's a powerful tool, you just need to be careful with it. I find that it is most useful with string literals, since they are guaranteed to last for the lifetime of the program. Another cool feature is that you can get substrings without creating a new string.

As an aside, there is a proposal to add a class similar to this to the standard library (it has since been standardized as std::string_view in C++17).
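A minimal sketch of the same non-owning idea, using std::string_view (C++17) rather than llvm::StringRef so that it builds without LLVM. The helper names has_prefix, after_first and view_examples are hypothetical and belong to neither library; the point is only that comparing and slicing a view copies no character data.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: true when `s` begins with `prefix`.
// Both arguments are views, so no character data is copied.
constexpr bool has_prefix(std::string_view s, std::string_view prefix) {
    return s.substr(0, prefix.size()) == prefix;
}

// Hypothetical helper: the part of `s` after the first `sep`,
// or an empty view when `sep` does not occur.
constexpr std::string_view after_first(std::string_view s, char sep) {
    const auto pos = s.find(sep);
    return pos == std::string_view::npos ? std::string_view{} : s.substr(pos + 1);
}

// Usage sketch.
inline void view_examples() {
    std::string_view lit = "Hello, world";  // views a literal, which lives for the whole program
    assert(lit != "Hello");
    assert(has_prefix(lit, "Hello"));

    std::string owned = "key=value";
    std::string_view v = owned;  // valid only while `owned` is alive and unmodified
    assert(after_first(v, '=') == "value");
}
```

The same lifetime caveat applies as with StringRef: a view taken from a std::string dangles once that string is destroyed or reallocates.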

OTHER TIPS

Are you doing C++ or C? Those are completely different languages with completely different ways of thinking.

If C++:

std::string str( "Hello" );
if( str == "Hello" ) { ... }

If C:

char const* str = "Hello";
if( strcmp( str, "Hello" ) == 0 ) { ... }

Don't mix both.

Using a compiler that implements the Small String Optimization, I get this result:

main    PROC                        ; COMDAT

; 6    : {

$LN124:
  00000 48 83 ec 48       sub    rsp, 72            ; 00000048H

; 7    :    std::string str( "Hello" );

  00004 8b 05 00 00 00
        00                mov    eax, DWORD PTR ??_C@_05COLMCDPH@Hello?$AA@

; 8    : 
; 9    :    if( str == "Hello" )

  0000a 48 8d 15 00 00
        00 00            lea     rdx, OFFSET FLAT:??_C@_05COLMCDPH@Hello?$AA@
  00011 48 8d 4c 24 20   lea     rcx, QWORD PTR str$[rsp]
  00016 89 44 24 20      mov     DWORD PTR str$[rsp], eax
  0001a 0f b6 05 04 00
        00 00            movzx   eax, BYTE PTR ??_C@_05COLMCDPH@Hello?$AA@+4
  00021 41 b8 05 00 00
        00               mov     r8d, 5
  00027 c6 44 24 37 00   mov     BYTE PTR str$[rsp+23], 0
  0002c 48 c7 44 24 38
        05 00 00 00      mov     QWORD PTR str$[rsp+24], 5
  00035 c6 44 24 25 00   mov     BYTE PTR str$[rsp+5], 0
  0003a 88 44 24 24      mov     BYTE PTR str$[rsp+4], al
  0003e e8 00 00 00 00   call    memcmp
  00043 85 c0            test    eax, eax
  00045 75 1d            jne     SHORT $LN123@main

; 10   :    { printf("Yes!\n"); }

  00047 48 8d 0d 00 00
        00 00            lea     rcx, OFFSET FLAT:??_C@_05IOIEDEHB@Yes?$CB?6?$AA@
  0004e e8 00 00 00 00   call    printf

; 11   : 
; 12   : }

Not a single memory allocation in sight!
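One way to check the same thing without reading assembly is to count calls to the global operator new. The sketch below is an assumption-laden test harness, not part of the original measurement: g_allocs and allocations_for are hypothetical names, and the result depends on the standard library in use (the small-string threshold is implementation-specific, and some debug configurations allocate even for short strings).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Hypothetical counter, bumped by the replacement operator new below.
static std::size_t g_allocs = 0;

// Replace the global allocation functions for this program so that
// every heap allocation made through operator new is counted.
void* operator new(std::size_t n) {
    ++g_allocs;
    if (void* p = std::malloc(n ? n : 1)) return p;
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Returns how many times operator new ran while constructing a
// std::string from `text`.
std::size_t allocations_for(const char* text) {
    assert(text != nullptr);
    const std::size_t before = g_allocs;
    std::string s(text);
    // Read one character through a volatile pointer to discourage the
    // optimizer from discarding the string altogether.
    const volatile char* p = s.data();
    const char first = *p;
    (void)first;
    return g_allocs - before;
}
```

With a small-string-optimizing library, allocations_for("Hello") would be expected to report no allocation, while a string of several dozen characters reports at least one.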

Under the hood, std::string::operator== ends up doing roughly what strcmp does: it compares the lengths and then the bytes, typically via memcmp. Honestly, if fragmentation isn't an issue for you and you like the STL's more readable syntax, go for it and use the STL. If performance is an issue and profiling shows that constant allocation/deallocation of std::string's internal data is a hotspot, optimize there. If you don't like the inconsistent style of mixing operator==() and strcmp, write something like this:

#include <cstring>
#include <string>

inline bool str_eq(const char* const lhs, const char* const rhs)
{
    return strcmp(lhs, rhs) == 0;
}
inline bool str_eq(const std::string& lhs, const char* const rhs)
{
    return str_eq(lhs.c_str(), rhs);
}
inline bool str_eq(const char* const lhs, const std::string& rhs)
{
    return str_eq(lhs, rhs.c_str());
}
inline bool str_eq(const std::string& lhs, const std::string& rhs)
{
    return lhs == rhs;
}
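One caveat worth knowing about the overloads that go through c_str(): strcmp stops at the first NUL byte, whereas operator== compares the full length. The sketch below repeats two of the overloads so it compiles on its own; with_embedded_nul is a hypothetical helper that exists only to build a test string.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Same shape as the overloads above, repeated so this sketch is self-contained.
inline bool str_eq(const char* lhs, const char* rhs) {
    assert(lhs && rhs);  // strcmp requires non-null, NUL-terminated input
    return std::strcmp(lhs, rhs) == 0;
}
inline bool str_eq(const std::string& lhs, const std::string& rhs) {
    return lhs == rhs;
}

// Hypothetical helper: a 5-character string "ab\0cd" with an embedded NUL.
inline std::string with_embedded_nul() {
    return std::string("ab\0cd", 5);
}
```

For a = with_embedded_nul() and b = "ab", str_eq(a, b) is false because the lengths differ, but str_eq(a.c_str(), b.c_str()) is true because strcmp never looks past the embedded NUL. For ordinary text without embedded NULs, the two routes agree.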

This shouldn't really be a religious conversation. Both work the same. Now, if you see somebody writing either of these:

std::string str( "Hello" );
if( strcmp(str.c_str(), "Hello") == 0 ) { ... }

std::string str( "Hello" );
if( str.compare( "Hello" ) == 0) { ... }

then you can have a debate on mixing styles, because both of those obviously would have been clearer using operator==.

If your team is coding in C++, you should use all the features it offers. Of course, C++ used properly takes care of memory management (constructors and destructors) and gives you more natural syntax (==, +).

You may think the OOP style is slower, but you must first measure that string operations really are the bottleneck. In most scenarios that's unlikely. Premature optimization is the root of all evil. Properly designed C++ classes will not lose to hand-written C code.
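A rough sketch of what "measure first" could look like for the comparison in question. Everything here is illustrative: count_matches_cstr, count_matches_string and time_us are hypothetical names, the workload is made up, and a real investigation would use a profiler on the actual application rather than a loop like this.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical workload: count entries equal to `needle` using strcmp.
inline std::size_t count_matches_cstr(const std::vector<const char*>& words,
                                      const char* needle) {
    assert(needle != nullptr);
    std::size_t n = 0;
    for (const char* w : words)
        if (std::strcmp(w, needle) == 0) ++n;
    return n;
}

// The same workload with std::string and operator==.
inline std::size_t count_matches_string(const std::vector<std::string>& words,
                                        const std::string& needle) {
    std::size_t n = 0;
    for (const std::string& w : words)
        if (w == needle) ++n;
    return n;
}

// Runs `work` `repeats` times and returns the elapsed wall time in microseconds.
template <typename F>
long long time_us(F&& work, int repeats) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i) work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

A caller would time both variants on the same data, with optimizations enabled, and accumulate the returned counts into a variable that is used afterwards so the compiler cannot drop the work. Only if the std::string version shows up as a meaningful cost is it worth replacing.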

Returning to your question: the worst variant is to mix libraries. You may replace C strings with an OOP library but still be using old-school I/O routines and math.

Licensed under: CC-BY-SA with attribution
