Question

During an interview I was requested (among other things) to implement the following function:

int StrPrintF(char **psz, const char *szFmt, ...);

It is similar to sprintf, except that instead of writing into already-allocated storage, the function must allocate the storage itself and return it in the *psz variable. Moreover, *psz may point to an already-allocated string (on the heap), which may potentially be used during the formatting. Naturally, this string must be freed by the appropriate means.

The return value should be the length of the newly created string, or negative on error.

This is my implementation:

int StrPrintF(char **psz, const char *szFmt, ...)
{
    va_list args;
    int nLen;

    va_start(args, szFmt);

    if ((nLen = vsnprintf(NULL, 0, szFmt, args)) >= 0)
    {
        char *szRes = (char*) malloc(nLen + 1);
        if (szRes)
            if (vsnprintf(szRes, nLen + 1, szFmt, args) == nLen)
            {
                free(*psz);
                *psz = szRes;
            }
            else
            {
                free(szRes);
                nLen = -1;
            }
        else
            nLen = -1;
    }

    va_end(args);
    return nLen;
}

The question author claims there's a bug in this implementation. Not just a standard violation that may fail on particular esoteric systems, but a "real" bug, which by chance may fail on most systems.

It's also not related to the use of int instead of a type suited to object sizes, such as size_t or ptrdiff_t. Assume the strings are of "reasonable" size.

I really have no clue what the bug could be. All the pointer arithmetic is ok IMHO. I don't even assume that two consecutive invocations of vsnprintf produce the same result. All the variadic-handling stuff is also correct IMHO. va_copy is not needed (it's the responsibility of the callee that uses va_list). Also on x86 va_copy and va_end are meaningless.

I'd appreciate it if someone could spot the (potential) bug.

EDIT:

After checking out the answers and comments - I'd like to add some notes:

  • Naturally I've built and run the code with various inputs, including stepping through it in the debugger and watching the state of the variables. I'd never ask for help without trying things myself first. I saw no signs of problems, no stack/heap corruption, etc. I've also run it in a debug build, with the debug heap enabled (which is intolerant of heap corruption).
  • I assume that the function is called with valid parameters, i.e. psz is a valid pointer (not to confuse with *psz), szFmt is a valid format specifier, and all the variadic parameters are evaluated and correspond to the format string.
  • Calling free with NULL pointer is ok according to the standard.
  • Calling vsnprintf with a NULL pointer and size=0 is ok. It should return the resulting string length. The MS version, though not fully standard-compliant, does the same in this specific case.
  • vsnprintf won't exceed the specified buffer size, including the 0-terminator. This means it does not always write the terminator.
  • Please put the coding style aside (if you don't like it - fine with me).

Solution

va_copy is not needed (it's the responsibility of the callee that uses va_list)

Not quite right. I didn't find any such requirement for vsnprintf in the C11 standard. It does say this in a footnote:

As the functions vfprintf, vfscanf, vprintf, vscanf, vsnprintf, vsprintf, and vsscanf invoke the va_arg macro, the value of arg after the return is indeterminate.

When you call vsnprintf, the va_list can be passed by value or by reference (it's an opaque type for all we know). So the first vsnprintf can actually modify va_list and ruin things for the second. The recommended approach is to make a copy using va_copy.

And indeed, according to this article it doesn't happen that way on x86 but it does on x64.
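As a minimal sketch of the va_copy approach described above (not necessarily what the interviewer had in mind), the function can take a snapshot of the argument list before the measuring pass and hand the snapshot to the second vsnprintf call. The name argsCopy is introduced here for illustration only.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Sketch: same contract as the question's StrPrintF, but each
   vsnprintf call consumes its own va_list. */
int StrPrintF(char **psz, const char *szFmt, ...)
{
    va_list args, argsCopy;   /* argsCopy is an illustrative name */
    int nLen;

    va_start(args, szFmt);
    va_copy(argsCopy, args);  /* snapshot taken before args is consumed */

    /* Measuring pass: afterwards the value of args is indeterminate. */
    nLen = vsnprintf(NULL, 0, szFmt, args);
    va_end(args);

    if (nLen >= 0)
    {
        char *szRes = malloc((size_t)nLen + 1);
        if (szRes && vsnprintf(szRes, (size_t)nLen + 1, szFmt, argsCopy) == nLen)
        {
            /* Free the old string only after formatting, so the old
               *psz may itself appear among the arguments. */
            free(*psz);       /* free(NULL) is a no-op */
            *psz = szRes;
        }
        else
        {
            free(szRes);      /* also fine when malloc returned NULL */
            nLen = -1;
        }
    }

    va_end(argsCopy);         /* every va_copy needs a matching va_end */
    return nLen;
}
```

The caller is assumed to initialise the pointer to NULL (or to a heap string) before the first call, for example `char *s = NULL; StrPrintF(&s, "%s-%d", "abc", 42);`.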

OTHER TIPS

The first argument of vsnprintf should not be null according to:

http://msdn.microsoft.com/en-us/library/1kt27hek(v=vs.80).aspx

Edit 1: You should not free *psz if it is null!

The first call to vsnprintf() is really an attempt to get the length of the final string. However, it has a side effect! It moves the variable argument to the next one in the list as well. So, the next call to vsnprintf() does not have the first argument in the list captured. The easy hack is to reset the variable argument list to start again once you get the length from the first vsnprintf(). Maybe there's another way to do this better but, yeah, that's the issue.
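The "reset" idea can be sketched as follows, assuming a C99-conforming vsnprintf: the list is closed with va_end after the measuring pass and reopened with va_start before the formatting pass, which the standard permits. The function name StrPrintFRestart is hypothetical and only used to keep this variant apart from the original.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Sketch: restart the argument list instead of copying it.
   StrPrintFRestart is an illustrative name, not from the question. */
int StrPrintFRestart(char **psz, const char *szFmt, ...)
{
    va_list args;
    char *szRes;
    int nLen;

    /* First traversal: measure only. */
    va_start(args, szFmt);
    nLen = vsnprintf(NULL, 0, szFmt, args);
    va_end(args);
    if (nLen < 0)
        return -1;

    szRes = malloc((size_t)nLen + 1);
    if (!szRes)
        return -1;

    /* Second traversal: va_start after va_end begins again at the
       first variadic argument. */
    va_start(args, szFmt);
    if (vsnprintf(szRes, (size_t)nLen + 1, szFmt, args) == nLen)
    {
        free(*psz);           /* old string released only on success */
        *psz = szRes;
    }
    else
    {
        free(szRes);
        nLen = -1;
    }
    va_end(args);

    return nLen;
}
```

Compared with va_copy, this only works inside the variadic function itself; a helper that receives a va_list as a parameter cannot restart it and has to copy it instead.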

Moreover, *psz may point to an already-allocated string (on the heap), which may potentially be used during the formatting.

For *psz to be potentially reusable, some indication of whether it's garbage or a valid heap pointer is needed. Given no function argument indicating that, you can assume the only sane convention of a NULL sentinel value, i.e. if *psz is not NULL, then you can reuse it provided that the data you wish to format can fit into the same space. As the function is not given any indication of the amount of memory previously allocated, you can either:

  • use realloc and trust it to avoid needless movement of the buffer, or
  • infer a minimum pre-existing buffer size from strlen(). This would mean that if you write, say, a long string, then a short string, then the original long string into the buffer, the last operation will needlessly replace the buffer.

Clearly realloc is a better bet.

int StrPrintF(char **psz, const char *szFmt, ...)
{
     va_list args, args2;
     int nLen;
     va_start(args, szFmt);
     va_copy(args2, args);  // the first vsnprintf leaves args indeterminate
     if ((nLen = vsnprintf(NULL, 0, szFmt, args)) >= 0)
     {
         char *szRes = (char*) realloc(*psz, nLen + 1);
                             // ^ realloc does a fresh allocation if *psz == NULL
         if (szRes)
             vsnprintf(*psz = szRes, nLen + 1, szFmt, args2); // can't fail
                       // ^ note the assignment....
         else
             nLen = -1;
     }
     va_end(args2);
     va_end(args);
     return nLen;
}

Note too - from a Linux manpage for printf() - if your sprintf() doesn't return a useful length you've got to get/write an implementation that does....

Concerning the return value of snprintf(), SUSv2 and C99 contradict each other: when snprintf() is called with size=0 then SUSv2 stipulates an unspecified return value less than 1, while C99 allows str to be NULL in this case, and gives the return value (as always) as the number of characters that would have been written in case the output string has been large enough.
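To illustrate the C99 behaviour quoted above, here is a small sketch that calls snprintf with a NULL buffer and size 0 purely to measure the output. The helper name FormattedLength and the format string are made up for the example; on a library that follows SUSv2 instead, the return value would not be usable this way.

```c
#include <stdio.h>

/* Hypothetical helper: number of characters "%d items" would produce,
   not counting the terminating '\0'. Relies on C99 semantics for
   snprintf(NULL, 0, ...). */
int FormattedLength(int count)
{
    return snprintf(NULL, 0, "%d items", count);
}
```

A caller would then allocate FormattedLength(count) + 1 bytes before formatting for real.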

Without giving you the answer outright: check your inputs.

Licensed under: CC-BY-SA with attribution