Question

In the section covering setlocale, the ANSI C standard states in a footnote that the only ctype.h functions whose behaviour is not affected by the current locale are isdigit and isxdigit.

The Microsoft implementation of isdigit is locale dependent because, for example, in locales using code page 1250 isdigit only returns non-zero for characters in the range 0x30 ('0') - 0x39 ('9'), whereas in locales using code page 1252 isdigit also returns non-zero for the superscript digits 0xB2 ('²'), 0xB3 ('³') and 0xB9 ('¹').
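One way to observe the behaviour described above is to count how many values in the range 0 to 255 isdigit accepts under a given locale. The following is only a minimal sketch: the helper name count_isdigit_hits is hypothetical, and which locale names setlocale accepts depends on the C runtime in use.

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: counts the values in 0..255 for which isdigit
   returns non-zero under the named LC_CTYPE locale.
   Returns -1 if the runtime does not accept the locale name. */
int count_isdigit_hits(const char *locale_name)
{
    int c;
    int hits = 0;

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;

    for (c = 0; c <= 255; ++c) {
        if (isdigit(c))
            ++hits;
    }

    /* Reset to the "C" locale, which is the program-startup default. */
    setlocale(LC_CTYPE, "C");
    return hits;
}
```

Calling count_isdigit_hits("C") should report exactly the ten decimal digits. Passing a locale name that selects code page 1252 on the Microsoft runtime would, going by the description above, report a higher count because of the superscript digits.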

Is Microsoft in violation of the C standard by making isdigit locale dependent?

In this question I am primarily interested in C90, which Microsoft claims to conform to, rather than C99.

Additional background:

Microsoft's own documentation of setlocale incorrectly states that isdigit is unaffected by the LC_CTYPE part of the locale.

The section of the C standard that covers the ctype.h functions contains some wording that I consider ambiguous:

The behavior of these functions is affected by the current locale. Those functions that have locale-specific aspects only when not in the "C" locale are noted below.

I consider this ambiguous because it is unclear what it is trying to say about functions such as isdigit for which there are no notes about locale-specific aspects. It might be trying to say that such functions must be assumed to be locale dependent, in which case Microsoft's implementation of isdigit would be OK. (Except that the footnote I mentioned earlier seems to contradict this interpretation.)


Solution

  1. Microsoft is always right.
  2. If Microsoft is not right, see item 1.

Microsoft always has its own interpretation of the spec. And usually the sentence “but Microsoft is wrong” does not carry any weight with your CEO, so you have to code around MS bugs/interpretations.

The amount of code to support incorrect behavior of IE and Outlook is staggering.

In many cases, the only solution is to roll your own version of the function that does the right thing and do something like this:

#include <ctype.h>

int my_isdigit( int c )
{
#ifdef _WIN32
  /* Locale-independent check: only '0' through '9' count as digits. */
  return c >= '0' && c <= '9';
#else
  return isdigit( c );
#endif
}
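The footnote cited in the question names isxdigit alongside isdigit, so the same approach can be applied if isxdigit ever needs a locale-independent replacement. This is a sketch only; the name portable_isxdigit is hypothetical, and nothing above claims that Microsoft's isxdigit actually misbehaves.

```c
/* Hypothetical locale-independent counterpart to isxdigit.
   The C standard guarantees that '0'..'9' are contiguous, but makes no
   such guarantee for letters, so the hex letters are listed explicitly. */
int portable_isxdigit(int c)
{
    if (c >= '0' && c <= '9')
        return 1;

    switch (c) {
    case 'a': case 'b': case 'c': case 'd': case 'e': case 'f':
    case 'A': case 'B': case 'C': case 'D': case 'E': case 'F':
        return 1;
    default:
        return 0;
    }
}
```

Because the function never consults the locale, values such as 0xB2 are rejected regardless of which code page is active.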

OTHER TIPS

The required character set is defined in section 2.2.1. Section 2.2.1.2 then goes on to describe the behavior of extension characters:

  • The single-byte characters defined in §2.2.1 shall be present.
  • The presence, meaning, and representation of any additional members is locale-specific.