Question

Is there something like startsWith(str_a, str_b) in the standard C library?

It should take pointers to two strings that end with nullbytes, and tell me whether the first one also appears completely at the beginning of the second one.

Examples:

"abc", "abcdef" -> true
"abcdef", "abc" -> false
"abd", "abdcef" -> true
"abc", "abc"    -> true
Was it helpful?

Solution

Apparently there's no standard C function for this. So:

bool startsWith(const char *pre, const char *str)
{
    size_t lenpre = strlen(pre),
           lenstr = strlen(str);
    return lenstr < lenpre ? false : memcmp(pre, str, lenpre) == 0;
}

Note that the above is nice and clear, but if you're doing it in a tight loop or working with very large strings, it may not offer the best performance, as it scans the full length of both strings up front (strlen). Solutions like wj32's or Christoph's may offer better performance (although this comment about vectorization is beyond my ken of C). Also note Fred Foo's solution which avoids strlen on str (he's right, it's unnecessary if you use strncmp instead of memcmp). Only matters for (very) large strings or repeated use in tight loops, but when it matters, it matters.

OTHER TIPS

There's no standard function for this, but you can define

bool prefix(const char *pre, const char *str)
{
    return strncmp(pre, str, strlen(pre)) == 0;
}

We don't have to worry about str being shorter than pre because according to the C standard (7.21.4.4/2):

The strncmp function compares not more than n characters (characters that follow a null character are not compared) from the array pointed to by s1 to the array pointed to by s2."

I'd probably go with strncmp(), but just for fun a raw implementation:

_Bool starts_with(const char *restrict string, const char *restrict prefix)
{
    while(*prefix)
    {
        if(*prefix++ != *string++)
            return 0;
    }

    return 1;
}

I'm no expert at writing elegant code, but...

int prefix(const char *pre, const char *str)
{
    char cp;
    char cs;

    if (!*pre)
        return 1;

    while ((cp = *pre++) && (cs = *str++))
    {
        if (cp != cs)
            return 0;
    }

    if (!cs)
        return 0;

    return 1;
}

Use strstr() function. Stra == strstr(stra, strb)

Optimized (v.2. - corrected):

uint32 startsWith( const void* prefix_, const void* str_ ) {
    uint8 _cp, _cs;
    const uint8* _pr = (uint8*) prefix_;
    const uint8* _str = (uint8*) str_;
    while ( ( _cs = *_str++ ) & ( _cp = *_pr++ ) ) {
        if ( _cp != _cs ) return 0;
    }
    return !_cp;
}

Because I ran the accepted version and had a problem with a very long str, I had to add in the following logic:

bool longEnough(const char *str, int min_length) {
    int length = 0;
    while (str[length] && length < min_length)
        length++;
    if (length == min_length)
        return true;
    return false;
}

bool startsWith(const char *pre, const char *str) {
    size_t lenpre = strlen(pre);
    return longEnough(str, lenpre) ? strncmp(str, pre, lenpre) == 0 : false;
}

Or a combination of the two approaches:

_Bool starts_with(const char *restrict string, const char *restrict prefix)
{
    char * const restrict prefix_end = prefix + 13;
    while (1)
    {
        if ( 0 == *prefix  )
            return 1;   
        if ( *prefix++ != *string++)
            return 0;
        if ( prefix_end <= prefix  )
            return 0 == strncmp(prefix, string, strlen(prefix));
    }  
}

EDIT: The code below does NOT work because if strncmp returns 0 it is not known if a terminating 0 or the length (block_size) was reached.

An additional idea is to compare block-wise. If the block is not equal compare that block with the original function:

_Bool starts_with_big(const char *restrict string, const char *restrict prefix)
{
    size_t block_size = 64;
    while (1)
    {
        if ( 0 != strncmp( string, prefix, block_size ) )
          return starts_with( string, prefix);
        string += block_size;
        prefix += block_size;
        if ( block_size < 4096 )
          block_size *= 2;
    }
}

The constants 13, 64, 4096, as well as the exponentiation of the block_size are just guesses. It would have to be selected for the used input data and hardware.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top