Regarding vsnprintf (Interview)
Question
During an interview I was requested (among other things) to implement the following function:
int StrPrintF(char **psz, const char *szFmt, ...);
similar to sprintf, except that instead of writing to already-allocated storage, the function must allocate the storage itself and return it in the *psz variable. Moreover, *psz may point to an already-allocated string (on the heap), which may potentially be used during the formatting. Naturally this string must be freed by the appropriate means.
The return value should be the length of the newly created string, or negative on error.
This is my implementation:
int StrPrintF(char **psz, const char *szFmt, ...)
{
    va_list args;
    int nLen;
    va_start(args, szFmt);
    if ((nLen = vsnprintf(NULL, 0, szFmt, args)) >= 0)
    {
        char *szRes = (char*) malloc(nLen + 1);
        if (szRes)
            if (vsnprintf(szRes, nLen + 1, szFmt, args) == nLen)
            {
                free(*psz);
                *psz = szRes;
            }
            else
            {
                free(szRes);
                nLen = -1;
            }
        else
            nLen = -1;
    }
    va_end(args);
    return nLen;
}
The author of the interview question claims there's a bug in this implementation: not just a standard violation that may fail on particular esoteric systems, but a "real" bug that can actually fail on most systems.
It's also not related to the use of int instead of a type suited to memory sizes, such as size_t or ptrdiff_t. Assume the strings are of "reasonable" size.
I really have no clue what the bug could be. All the pointer arithmetic is ok IMHO. I don't even assume that two consecutive invocations of vsnprintf produce the same result. All the variadic-handling stuff is also correct IMHO. va_copy is not needed (it's the responsibility of the callee that uses va_list). Also on x86 va_copy and va_end are meaningless.
I'd appreciate it if someone can spot the (potential) bug.
EDIT:
After checking out the answers and comments - I'd like to add some notes:
- Naturally I've built and run the code with various inputs, including stepping through it in the debugger and watching the state of the variables. I'd never ask for help without trying things myself first. I saw no signs of problems, no stack/heap corruption, etc. I've also run it in a debug build, with the debug heap enabled (which is intolerant of heap corruption).
- I assume that the function is called with valid parameters, i.e. psz is a valid pointer (not to be confused with *psz), szFmt is a valid format specifier, and all the variadic parameters are evaluated and correspond to the format string.
- Calling free with a NULL pointer is ok according to the standard.
- Calling vsnprintf with a NULL pointer and size=0 is ok. It should return the resulting string length. The MS version, though not fully standard-compliant, does the same in this specific case. vsnprintf won't exceed the specified buffer size, including the 0-terminator, which means it does not always write the terminator.
- Please put the coding style aside (if you don't like it, fine with me).
Solution
"va_copy is not needed (it's the responsibility of the callee that uses va_list)"
Not quite right. I didn't find any such requirement for vsnprintf in the C11 standard. It does say this in a footnote:
"As the functions vfprintf, vfscanf, vprintf, vscanf, vsnprintf, vsprintf, and vsscanf invoke the va_arg macro, the value of arg after the return is indeterminate."
When you call vsnprintf, the va_list can be passed by value or by reference (it's an opaque type for all we know). So the first vsnprintf can actually modify the va_list and ruin things for the second. The recommended approach is to make a copy using va_copy.
And indeed, according to this article it doesn't happen that way on x86 but it does on x64.
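As a minimal sketch of the va_copy approach, here is the question's function with a separate va_list for each vsnprintf call. The variable name argsCopy and the assert on the inputs are additions for illustration; everything else follows the question's code and assumes a C99-conforming vsnprintf.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Sketch: same contract as the question's StrPrintF, but each vsnprintf
   call gets its own va_list, so neither depends on the other's state. */
int StrPrintF(char **psz, const char *szFmt, ...)
{
    va_list args, argsCopy;   /* argsCopy is a name introduced for this sketch */
    int nLen;

    assert(psz != NULL && szFmt != NULL);   /* the question assumes valid inputs */

    va_start(args, szFmt);
    va_copy(argsCopy, args);                /* copy taken before args is consumed */

    nLen = vsnprintf(NULL, 0, szFmt, args); /* pass 1: measure only */
    va_end(args);

    if (nLen >= 0)
    {
        char *szRes = malloc((size_t)nLen + 1);
        if (szRes && vsnprintf(szRes, (size_t)nLen + 1, szFmt, argsCopy) == nLen)
        {
            free(*psz);       /* old string released only after formatting */
            *psz = szRes;
        }
        else
        {
            free(szRes);      /* free(NULL) is a no-op */
            nLen = -1;
        }
    }
    va_end(argsCopy);
    return nLen;
}
```

Because the old string is freed only after the second pass, a call such as StrPrintF(&s, "%s!", s) should still read valid memory while formatting, which seems to be what the question means by the string being "used during the formatting".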
OTHER TIPS
The first argument of vsnprintf should not be null according to:
http://msdn.microsoft.com/en-us/library/1kt27hek(v=vs.80).aspx
Edit 1: You should not free *psz if it is null!
The first call to vsnprintf() is really just an attempt to get the length of the final string. However, it has a side effect: it may advance the variable argument list past the arguments it reads. If it does, the second call to vsnprintf() no longer starts at the first argument. The easy hack is to reset the variable argument list so it starts again once you have the length from the first vsnprintf(). Maybe there's a better way to do this but, yeah, that's the issue.
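The reset described here can be sketched by ending the first traversal with va_end and starting a new one with va_start before the second pass. The function name StrPrintFRestart and the variable nLen2 are hypothetical, chosen only for this illustration; a C99-conforming vsnprintf is assumed.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical variant: restart the argument list between the two passes. */
int StrPrintFRestart(char **psz, const char *szFmt, ...)
{
    va_list args;
    int nLen, nLen2;
    char *szRes;

    assert(psz != NULL && szFmt != NULL);   /* inputs assumed valid */

    va_start(args, szFmt);
    nLen = vsnprintf(NULL, 0, szFmt, args); /* pass 1 may consume args */
    va_end(args);
    if (nLen < 0)
        return -1;

    szRes = malloc((size_t)nLen + 1);
    if (!szRes)
        return -1;

    va_start(args, szFmt);                  /* fresh traversal for pass 2 */
    nLen2 = vsnprintf(szRes, (size_t)nLen + 1, szFmt, args);
    va_end(args);

    if (nLen2 != nLen)
    {
        free(szRes);
        return -1;
    }
    free(*psz);                             /* release the previous string last */
    *psz = szRes;
    return nLen;
}
```

This only works inside the variadic function itself, since va_start needs the named parameter szFmt; a helper that merely receives a va_list would have to use va_copy instead.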
"Moreover, *psz may point to an already-allocated string (on the heap), which may potentially be used during the formatting."
For *psz to be potentially reusable, some indication of whether it's garbage or a valid heap pointer is needed. Given no function argument indicating that, you can assume the only sane convention, a NULL sentinel value: if *psz is not NULL, then you can reuse it provided that the data you wish to format fits into the same space. As the function is not given any indication of the amount of memory previously allocated, you can either:
- use realloc and trust it to avoid needless movement of the buffer
- infer a minimum pre-existing buffer size from strlen(); this would mean that if you write, say, a long string, then a short string, then the original long string into the buffer, the last operation will needlessly replace the buffer.
Clearly realloc is a better bet.
int StrPrintF(char **psz, const char *szFmt, ...)
{
    va_list args, argsCopy;
    int nLen;
    va_start(args, szFmt);
    va_copy(argsCopy, args); // the first vsnprintf leaves args indeterminate
    if ((nLen = vsnprintf(NULL, 0, szFmt, args)) >= 0)
    {
        char *szRes = (char*) realloc(*psz, nLen + 1);
        //                    ^ realloc does a fresh allocation if *psz == NULL
        if (szRes)
            vsnprintf(*psz = szRes, nLen + 1, szFmt, argsCopy); // buffer fits the measured length
            //        ^ note the assignment....
        else
            nLen = -1;
    }
    va_end(argsCopy);
    va_end(args);
    return nLen;
}
Note too, from a Linux manpage for printf(): if your sprintf() doesn't return a useful length, you've got to get/write an implementation that does.
"Concerning the return value of snprintf(), SUSv2 and C99 contradict each other: when snprintf() is called with size=0 then SUSv2 stipulates an unspecified return value less than 1, while C99 allows str to be NULL in this case, and gives the return value (as always) as the number of characters that would have been written in case the output string has been large enough."
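The C99 behaviour described in that passage can be sketched as a small length-measuring helper. The name FormattedLength is hypothetical; the sketch assumes a C99-conforming library, and on a library that follows SUSv2 here the return value may not be the length at all.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: the length the formatted output would have
   (excluding the terminator), or a negative value on error.
   Relies on C99 semantics for a NULL buffer with size 0. */
int FormattedLength(const char *szFmt, ...)
{
    va_list args;
    int nLen;

    assert(szFmt != NULL);   /* format string assumed valid */

    va_start(args, szFmt);
    nLen = vsnprintf(NULL, 0, szFmt, args);   /* writes nothing, only counts */
    va_end(args);
    return nLen;
}
```

A caller would allocate FormattedLength(...) + 1 bytes to leave room for the terminating 0.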
Without giving you the answer outright: check your inputs.