Question

Can anyone please explain the code below, and also the role of the backslash ( \ ) in such situations? What do \' , \" , \ooo , \\ and \? mean?

#include <stdio.h>
int main(){
    char a = '\010';
    char y = '010';
    printf("%d,%d",a,y);

    return 0;
}

output: 8,48


Solution

'\010' is an octal escape sequence: 10 in octal is 8 in decimal. The char is promoted to an int when it is passed to printf, which explains the first value.

'010' is a multi-character constant and its value is implementation-defined. The C99 draft standard, section 6.4.4.4 Character constants, paragraph 10 says:

[...]The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined.[...]

and if you were using gcc you would have seen at least this warning:

warning: multi-character character constant [-Wmultichar]

and probably this warning as well on overflow:

warning: overflow in implicit constant conversion [-Woverflow]

The value that y obtains is a little more interesting. Since a character constant has an integer value, the compiler cannot just take the first character: the multi-character constant first takes an int value, which is then converted to char. clang helpfully provides a more detailed warning:

warning: implicit conversion from 'int' to 'char' changes value from 3158320 to 48 [-Wconstant-conversion]

and current versions of gcc produce the same value, as we can see from this simple piece of code:

printf("%d\n",'010');

So where does 3158320 come from? For gcc at least, the documentation for implementation-defined behavior says:

The compiler evaluates a multi-character character constant a character at a time, shifting the previous value left by the number of bits per target character, and then or-ing in the bit-pattern of the new character truncated to the width of a target character. The final bit-pattern is given type int, and is therefore signed, regardless of whether single characters are signed or not (a slight change from versions 3.1 and earlier of GCC). If there are more characters in the constant than would fit in the target int the compiler issues a warning, and the excess leading characters are ignored.

If we perform the operation documented above (assuming 8-bit char), we see:

 48*2^16 + 49*2^8 + 48  = 3158320
 ^         ^
 |         decimal value of ASCII '1'
 decimal value of ASCII '0'

gcc converts the int to char using modulo 2^8, regardless of whether char is signed or unsigned, which effectively leaves us with the last 8 bits, or 48.

OTHER TIPS

The backslash introduces an escape sequence. It is used to remove the special meaning of a reserved character such as ', to specify a special character such as new-line '\n', or, as in this case, to specify a character with a specific ASCII value:

char a = '\010';

defines a character with octal ASCII value 10 (base 8), i.e. decimal value 8 (base 10).

char y = '010';

defines a multi-character constant (not a wide character, which would be written L'0' and assigned to a wchar_t). Its value is implementation-defined rather than undefined. With gcc, converting it to char keeps only the low-order byte, so the last character, '0', is what ends up stored in y.

In the first case, \010 is interpreted as an octal number, which results in 8.

In the second case, '010' is a multi-character constant. With gcc, once it is converted to char only the last character, 0, remains, and its ASCII value is 48.

If you had compiler warnings on, chances are that you'd have figured out the second one

char y = '010';

by yourself. (gcc would have emitted -Wmultichar and -Woverflow in this case.)


Quoting C1X draft, section 6.4.4.4:

An integer character constant has type int. The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer. The value of an integer character constant containing more than one character (e.g., 'ab'), or containing a character or escape sequence that does not map to a single-byte execution character, is implementation-defined. If an integer character constant contains a single character or escape sequence, its value is the one that results when an object with type char whose value is that of the single character or escape sequence is converted to type int.

Hope this helps:

\a  Beep
\b  Backspace
\f  Formfeed
\n  New line
\r  Carriage return
\t  Horizontal tab
\v  Vertical tab
\\  Backslash
\'  Single quotation mark
\"  Double quotation mark
\?  Question mark (avoids being read as part of a trigraph)
\0  Null character, value 0 (string terminator)
\ooo    Octal character code, one to three octal digits (e.g. \010)
\xhh    Hexadecimal character code, one or more hex digits (e.g. \x08)

Escape sequences are character combinations that consist of a backslash (\) followed by some character. In almost all languages, you sometimes need to write a character that cannot be typed directly, or that would otherwise be given a special meaning.

For example, suppose you develop a language in which a variable is enclosed in square brackets, as in [var]. If you want to input a literal value rather than a variable, you have to escape the brackets, as in \[10\]. Otherwise your compiler will think that 10 is a variable name.

Licensed under: CC-BY-SA with attribution