Question

After learning that strncpy is not what it seems to be and that strlcpy is not available on my operating system (Linux), I figured I could try to write it myself.

I found a quote from Ulrich Drepper, the libc maintainer, who posted an alternative to strlcpy using mempcpy. I don't have mempcpy either, but its behaviour was easy to replicate. First off, this is the test case I have:

#include <stdio.h>
#include <string.h>

#define BSIZE 10

void insp(const char* s, int n)
{
   int i;

   for (i = 0; i < n; i++)
      printf("%c  ", s[i]);

   printf("\n");

   for (i = 0; i < n; i++)
      printf("%02X ", s[i]);

   printf("\n");

   return;
}

int copy_string(char *dest, const char *src, int n)
{
   int r = strlen(memcpy(dest, src, n-1));
   dest[r] = 0;

   return r;
}

int main()
{
   char b[BSIZE];
   memset(b, 0, BSIZE);

   printf("Buffer size is %d", BSIZE);

   insp(b, BSIZE);

   printf("\nFirst copy:\n");
   copy_string(b, "First", BSIZE);
   insp(b, BSIZE);
   printf("b = '%s'\n", b);

   printf("\nSecond copy:\n");
   copy_string(b, "Second", BSIZE);
   insp(b, BSIZE);

   printf("b = '%s'\n", b);

   return 0;
}

And this is its result:

Buffer size is 10                    
00 00 00 00 00 00 00 00 00 00 

First copy:
F  i  r  s  t     b     =    
46 69 72 73 74 00 62 20 3D 00 
b = 'First'

Second copy:
S  e  c  o  n  d          
53 65 63 6F 6E 64 00 00 01 00 
b = 'Second'

You can see in the internal representation (the lines insp() created) that there's some noise mixed in, like the printf() format string in the inspection after the first copy, and a foreign 0x01 in the second copy.

The strings are copied intact and it correctly handles too long source strings (let's ignore the possible issue with passing 0 as length to copy_string for now, I'll fix that later).

But why are there foreign array contents (from the format string) inside my destination? It's as if the destination was actually RESIZED to match the new length.


Solution

The end of the string is marked by a \0. The memory after that can be anything: unless your OS deliberately blanks it, it is just whatever random junk was left there.

Note that in this case the 'problem' isn't in copy_string as such: you are copying exactly 9 bytes (n-1), but the memory after "First" in your main code is just random.
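To illustrate the point about the terminator, here is a minimal sketch that reuses the ten bytes from the first dump in the question. The helper name dump_reads_as_first is made up for this example.

```c
#include <string.h>

/* The ten bytes from the first dump in the question: "First", its
   terminator, then leftovers that are not part of the string. */
static const char dump[10] = {
    'F', 'i', 'r', 's', 't', '\0', 'b', ' ', '=', '\0'
};

/* Hypothetical helper: returns 1 if the library sees only "First". */
int dump_reads_as_first(void)
{
    /* strlen and strcmp both stop at dump[5], the first '\0'. */
    return strlen(dump) == 5 && strcmp(dump, "First") == 0;
}
```

The bytes at indices 6 to 9 are still in the array, which is why insp() shows them, but no string function ever looks at them.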

OTHER TIPS

Because you are not stopping at the source size. You are stopping at the destination size, which happens to be bigger than the source, so you are copying the source string plus a bit of garbage past it.

You can easily see that you are copying your source string, with its null terminator. But since you are memcpy-ing 9 bytes (n-1) and both strings "First" and "Second" are shorter than that, even counting the terminator, you are also copying the extra bytes past them.

The use of memcpy(dest, src, n-1) invokes undefined behavior unless the objects that dest and src point to are both at least n-1 bytes in size.

For example, First\0 is six characters in length, but you read n-1 (9) characters from it; the contents of the memory past the end of the string literal are undefined, as is the behavior of your program when you read that memory.
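One way to avoid reading past the source is to measure it first, never looking further than n-1 bytes, and only then copy. The following is a sketch under that assumption, not the asker's or Drepper's code; the name copy_string_bounded is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies at most n-1 bytes of src into dest and
   null-terminates whenever n > 0. Returns the number of bytes copied,
   not counting the terminator. */
size_t copy_string_bounded(char *dest, const char *src, size_t n)
{
    size_t len = 0;

    if (n == 0)
        return 0;                 /* no room even for the terminator */

    /* Measure src, but never look further than n-1 bytes into it. */
    while (len < n - 1 && src[len] != '\0')
        len++;

    memcpy(dest, src, len);       /* copy only bytes that belong to src */
    dest[len] = '\0';
    return len;
}
```

With a 10-byte destination, copying "First" writes six bytes (five characters plus the terminator) and leaves the rest of the buffer as it was. A source longer than nine characters is cut off after nine.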

The extra "stuff" is there because you've passed the buffer size to memcpy. It's going to copy that many characters, even when the source is shorter.

I'd do things a bit differently:

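#include <string.h>

void copy_string(char *dest, char const *src, size_t n) {
    if (n == 0)
        return;                 /* no room even for the terminator */
    *dest = '\0';
    strncat(dest, src, n - 1);  /* strncat adds its own '\0', so leave room for it */
}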

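Unlike strncpy, strncat is defined to work how most people would reasonably expect: it stops at the end of the source and always null-terminates the result. Keep in mind that the terminator is written in addition to the count you pass, so the destination needs room for one more byte than that count.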
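The difference is easiest to see with a source that does not fit. The sketch below uses a 4-byte buffer and helper names invented for this example. It checks whether each function leaves a terminator in the buffer.

```c
#include <string.h>

/* Returns 1 if buf (of size n) contains a '\0' somewhere, else 0. */
int is_terminated(const char *buf, size_t n)
{
    return memchr(buf, '\0', n) != NULL;
}

/* strncpy writes no terminator when src has 4 or more characters. */
int strncpy_terminates(const char *src)
{
    char buf[4];
    memset(buf, 'x', sizeof buf);
    strncpy(buf, src, sizeof buf);
    return is_terminated(buf, sizeof buf);
}

/* strncat onto an empty string always terminates, as long as the
   count passed is at most the buffer size minus one. */
int strncat_terminates(const char *src)
{
    char buf[4];
    memset(buf, 'x', sizeof buf);
    buf[0] = '\0';
    strncat(buf, src, sizeof buf - 1);
    return is_terminated(buf, sizeof buf);
}
```

For a source such as "abcdef", strncpy fills all four bytes with characters and leaves the buffer unterminated, while strncat stores "abc" followed by a terminator.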

Licensed under: CC-BY-SA with attribution