The basic execution character set is guaranteed to be nonnegative - the precise wording in C99 is:
If a member of the basic execution character set is stored in a
char
object, its value is guaranteed to be nonnegative.
Domanda
Most C compilers use signed characters. Most C libraries define EOF as -1.
Despite being a long-time C programmer I had never before put these two facts together and so in the interest of robust and international software I would ask for a bit of help in spelling out the implications.
Here is what I have discovered thus far:
getchar() == (unsigned char) 'µ'
.<ctype.h>
functions are designed to handle EOF and expected unsigned characters. Any other negative input may cause out-of-bounds addressing.Is this assessment correct and if so what other gotchas did I miss?
Full disclosure: I ran into an out-of-bounds indexing bug today when feeding non-ASCII characters to isspace() and the realization of the amount of lurking bugs in my old code both scared and annoyed me. Hence this frustrated question.
Soluzione
The basic execution character set is guaranteed to be nonnegative - the precise wording in C99 is:
If a member of the basic execution character set is stored in a
char
object, its value is guaranteed to be nonnegative.