Parsing a YYYY-MM-DD date strictly on Linux
Question
POSIX defines a handy function, strptime, that can be used for parsing dates and times. Thus, theoretically, if I have a date of the format "YYYY-MM-DD", I should be able to use strptime to parse it like this:
char myDate[] = "2012-01-01";
struct tm result;
char *end = strptime(myDate, "%Y-%m-%d", &result);
… and get it back out in its canonical representation with:
if (end != NULL)
{
    char outDate[11];
    strftime(outDate, sizeof(outDate), "%Y-%m-%d", &result);
    printf("%s\n", outDate);
}
else
{
    printf("Invalid date\n");
}
On OS X and Linux, this prints out 2012-01-01. So far so good! However, let's say my input date is in the wrong format: 01-01-2012.
If I run the above code again, on OS X, I get "Invalid date", which is expected. However, on Linux, I get 1-01-20 — January 20th, 1 (yes, year one).
OS X follows the specifiers strictly, parsing a string as %Y only where a four-digit year exists. Linux takes a few liberties, though, and accepts two digits as a year. It doesn't even appear to assume 2001; it treats the value as year 1!
This can be worked around by changing my if statement to something like
if (end != NULL && *end == '\0')
… but that seems hokey. Does anyone know if it's possible to make strptime on Linux behave more strictly, and fail for %Y if the input string does not have a four-digit year?
Solution
There is no option to change strptime()'s behavior or to specify a field width within the format string. (I had a look at the glibc-2.3 sources: strptime discards format modifiers and field widths, so something like "%04Y" won't change a thing.)
But as was already pointed out in the comments, there is nothing wrong with if (end != NULL && *end == '\0'). I'd recommend using just that.
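For illustration, here is a minimal sketch of that check wrapped in a helper. The name parse_iso_date_strict is hypothetical, and the length test is an addition that is not part of the answer: since Linux accepts a two-digit year for %Y, as the question shows, an input such as "12-01-01" could otherwise be consumed completely and pass the end-of-string check.

```c
#define _XOPEN_SOURCE 700   /* must come before any #include so glibc declares strptime() */
#include <stdbool.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not part of POSIX): returns true only when the
 * whole string is consumed by "%Y-%m-%d" and is exactly 10 characters. */
static bool parse_iso_date_strict(const char *s, struct tm *out)
{
    /* Assumption beyond the original answer: "YYYY-MM-DD" is always
     * 10 characters long, so anything else is rejected up front. */
    if (strlen(s) != 10)
        return false;

    /* strptime only sets the fields named in the format, so clear the rest. */
    memset(out, 0, sizeof(*out));

    const char *end = strptime(s, "%Y-%m-%d", out);

    /* NULL means the parse failed; a non-'\0' character means leftover input. */
    return end != NULL && *end == '\0';
}
```

With this helper, "2012-01-01" is accepted, while "01-01-2012" is rejected on both platforms: OS X fails the parse outright, and Linux leaves the trailing "12" unconsumed.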
OTHER TIPS
I guess your best bet is to find the index of the first dash and pick one format mask or the other according to its value.
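A rough sketch of this idea follows. The helper name parse_date_by_first_dash is hypothetical, and the mapping from dash position to format is an assumption: the tip does not say which masks to use, so this version treats a dash at index 4 as YYYY-MM-DD and a dash at index 2 as DD-MM-YYYY.

```c
#define _XOPEN_SOURCE 700   /* must come before any #include so glibc declares strptime() */
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: choose the strptime format from the position of
 * the first dash, then require that the whole input is consumed. */
static bool parse_date_by_first_dash(const char *s, struct tm *out)
{
    const char *dash = strchr(s, '-');
    if (dash == NULL)
        return false;

    const char *fmt;
    switch (dash - s) {
    case 4:  fmt = "%Y-%m-%d"; break;  /* four-digit year comes first */
    case 2:  fmt = "%d-%m-%Y"; break;  /* assumption: day-month-year */
    default: return false;             /* any other layout is rejected */
    }

    /* strptime only sets the fields named in the format, so clear the rest. */
    memset(out, 0, sizeof(*out));

    const char *end = strptime(s, fmt, out);
    return end != NULL && *end == '\0';
}
```

If the second layout should instead be read as month-day-year, only the format string in the `case 2` branch needs to change.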