Question

From Chapter 2 (subsection 2.3, "Constants") of the K&R book on the C programming language:

Certain characters can be represented in character and string constants by escape sequences like \n (newline); these sequences look like two characters, but represent only one. In addition, an arbitrary byte-sized bit pattern can be specified by

'\ooo'

where ooo is one to three octal digits (0...7) or by

'\xhh'

where hh is one or more hexadecimal digits (0...9, a...f, A...F). So we might write

#define VTAB '\013'    /* ASCII vertical tab */
#define BELL '\007'    /* ASCII bell character */

or, in hexadecimal,
#define VTAB '\xb'     /* ASCII vertical tab */
#define BELL '\x7'     /* ASCII bell character */

The part that confuses me is the following wording (emphasis mine): where ooo is one to three octal digits (0...7). If there are three octal digits, the number of bits required will be 9 (3 for each digit), which exceeds the byte length required for characters. Surely I am missing something here. What is it that I am missing?


Solution

\ooo (three octal digits) does indeed allow 9-bit values to be written, from 0 to 111111111 (binary), that is 0 to 511 (octal 777). Whether such a value is allowed depends on the size of char.

Assignments such as those below generate a warning in many environments because a char is 8 bits there. In that case the highest octal sequence allowed is \377. But a char need not be 8 bits: CHAR_BIT must be at least 8 and may be larger. OP's "9... exceeds the byte length required for characters" is therefore not true in general.

char *s = "\777";  // warning: octal escape sequence out of range
char c  = '\777';  // same warning
int i   = '\777';  // same warning: the escape itself is out of range, whatever the target type
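As a rough sketch of how the limit follows from the char size, the largest valid octal escape can be derived from UCHAR_MAX in <limits.h>. The helper names max_octal_escape and octal_escape_fits are made up for this illustration and are not part of any library.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: writes the largest octal escape that fits in an
   unsigned char on this implementation, e.g. \377 when CHAR_BIT is 8. */
void max_octal_escape(char *buf, size_t size)
{
    snprintf(buf, size, "\\%o", (unsigned)UCHAR_MAX);
}

/* Hypothetical helper: nonzero if `value` is representable in an
   unsigned char, i.e. could legally be written as an octal escape. */
int octal_escape_fits(unsigned value)
{
    return value <= UCHAR_MAX;
}
```

Calling octal_escape_fits(0777) should return 0 on an implementation with 8-bit chars, and nonzero only where a char has more than 8 bits.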

The three-digit octal constant '\141' is the same as 'a' in a typical environment where ASCII is used. But in an alternate character set, 'a' could have a different value. Thus if one wanted to assign the bit pattern 01100001 portably, one could use '\141' instead of 'a'. One could accomplish the same by assigning '\x61'. In some contexts, an octal pattern may be preferred.
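A small sketch of that point: the two escapes always denote the value 97, while the comparison with 'a' depends on the execution character set. The function names here are invented for the example.

```c
/* Both escapes name the bit pattern 01100001 (decimal 97),
   independent of the execution character set. */
int octal_pattern(void) { return '\141'; }
int hex_pattern(void)   { return '\x61'; }

/* Nonzero only where 'a' has code 97, as it does in ASCII. */
int a_matches_ascii(void) { return 'a' == '\141'; }
```

On an ASCII-based system a_matches_ascii() returns 1. On a system using a different character set it may return 0, while octal_pattern() and hex_pattern() still return 97.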

C11 6.4.4.4 paragraph 9: "The value of an octal or hexadecimal escape sequence shall be in the range of representable values for the corresponding type", and when no prefix is used that type is unsigned char.

OTHER TIPS

The range of code numbers of characters is not defined in K&R, as far as I can remember. In the early days, it was usually the ASCII range 0...127. Nowadays it is often an 8-bit range, 0...255, but it could be wider, too. In any case, the implementation-defined limits on the char data type imply restrictions on the escape notations, too.

For example, if the range is 0...127, then \177 is the largest allowed octal escape.

The first octal digit is only allowed to go to 3 (two bits), not 7 (three bits), if we're talking about eight bit bytes. If we're talking about ASCII (7 bit values), the first digit can only be zero or one.
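The arithmetic behind the leading-digit limit can be sketched as follows. max_leading_octal_digit is a hypothetical helper, and it assumes a character width between 1 and 15 bits so that the shift stays within the range of unsigned.

```c
/* Hypothetical helper: largest leading digit of a three-digit octal escape
   that still fits in a character of `bits` bits (assumes 1 <= bits <= 15). */
unsigned max_leading_octal_digit(unsigned bits)
{
    unsigned max_value = (1u << bits) - 1u;  /* all `bits` bits set */
    unsigned leading = max_value >> 6;       /* drop the two low octal digits */
    return leading > 7u ? 7u : leading;      /* an octal digit is at most 7 */
}
```

For 7 bits this gives 1 (largest escape \177), for 8 bits it gives 3 (largest escape \377), and for 9 bits or more the full range up to 7 is available.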

If K&R says otherwise, their description is either incomplete or incorrect.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow