Why is '\x' invalid in Python?
01-10-2019
Question
I was experimenting with '\' characters, using '\a\b\c...' just to enumerate for myself which characters Python interprets as control characters, and to what. Here's what I found:
\a - BELL
\b - BACKSPACE
\f - FORMFEED
\n - LINEFEED
\r - RETURN
\t - TAB
\v - VERTICAL TAB
Most of the other characters I tried, '\g', '\s', etc., just evaluate to the 2-character string of a backslash and the given character. I understand this is intentional, and it makes sense to me.
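The observations above can be checked mechanically. This is only a sketch: `parse_literal` is a hypothetical helper, not something from the question, and it parses the source text of each literal from a raw string so the backslashes reach the parser untouched. Newer Python 3 releases emit a warning for unrecognized escapes such as '\g', so the helper silences warnings.

```python
import ast
import warnings

# Source text of each literal (raw strings) mapped to the expected code point.
recognized = {
    r"'\a'": 7,   # BELL
    r"'\b'": 8,   # BACKSPACE
    r"'\f'": 12,  # FORMFEED
    r"'\n'": 10,  # LINEFEED
    r"'\r'": 13,  # RETURN
    r"'\t'": 9,   # TAB
    r"'\v'": 11,  # VERTICAL TAB
}


def parse_literal(source):
    """Evaluate the source text of a string literal (hypothetical helper)."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # unrecognized escapes may warn
        return ast.literal_eval(source)


for source, code in recognized.items():
    value = parse_literal(source)
    assert len(value) == 1 and ord(value) == code

# An unrecognized escape keeps both the backslash and the letter.
unrecognized = parse_literal(r"'\g'")
print(len(unrecognized))        # 2
print(unrecognized == "\\g")    # True
```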
But '\x' is a problem. When my script reaches this source line:
val = "\x"
I get:
ValueError: invalid \x escape
What is so special about '\x'? Why is it treated differently from the other non-escaped characters?
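The failure can be reproduced without crashing a script by compiling the offending line from a raw string. This is a sketch under one assumption: the exception type depends on the interpreter version (the ValueError quoted above is what older interpreters report, while Python 3 raises a SyntaxError at compile time), so both are caught and the message is printed rather than assumed.

```python
failed = None
source = r'val = "\x"'  # raw string holding the text of the offending line

try:
    compile(source, "<example>", "exec")
except (SyntaxError, ValueError) as exc:
    failed = True
    print(type(exc).__name__, exc)  # exact wording varies by version
else:
    failed = False

# Supplying two hex digits after \x makes the same line compile and run.
namespace = {}
exec(compile(r'val = "\x41"', "<example>", "exec"), namespace)
print(namespace["val"])  # A
```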
Solution
There is a table listing all the escape codes and their meanings in the documentation.
Escape Sequence    Meaning                        Notes
\xhh               Character with hex value hh    (4,5)
Notes:
4. Unlike in Standard C, exactly two hex digits are required.
5. In a string literal, hexadecimal and octal escapes denote the byte with the given value; it is not necessary that the byte encodes a character in the source character set. In a Unicode literal, these escapes denote a Unicode character with the given value.
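A small sketch of what the two notes mean in practice. The first part shows that exactly two hex digits are consumed. The second part is an assumption about how note 5 maps onto Python 3: the note describes Python 2's plain and Unicode literals, and the closest modern analog is bytes versus str.

```python
# Note 4: exactly two hex digits belong to the escape; the rest is plain text.
two_digits = "\x41"
three_digits = "\x411"       # \x41 followed by the ordinary character '1'
print(two_digits)            # A
print(three_digits)          # A1
print(len(three_digits))     # 2

# Note 5 (Python 3 analog, an assumption): in a bytes literal the escape is a
# raw byte, in a str literal it is the Unicode character with that code point.
raw = b"\xff"
text = "\xff"
print(raw[0])                     # 255
print(ord(text))                  # 255
print(len(text.encode("utf-8")))  # 2
```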
OTHER TIPS
\xhh
is the hex escape: it represents the character whose value is given by the two hex digits hh.
\x introduces a one-byte hexadecimal escape inside a string, for example:
'\x61'
will evaluate to 'a', because hexadecimal 61 is 97 in decimal, which is the ASCII code for 'a'.
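The arithmetic is easy to confirm in an interpreter. A minimal sketch using only built-ins:

```python
value = "\x61"
print(value)                # a
print(ord(value))           # 97
print(hex(ord(value)))      # 0x61
print(int("61", 16))        # 97
print(chr(0x61) == value)   # True
```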
\x is missing the hex digits that must follow it. The full form is \xnn, for example \x1B.
You're not giving the full escape sequence:
\xhh...
The hexadecimal value hh, where hh stands for a sequence of hexadecimal digits (‘0’–‘9’, and either ‘A’–‘F’ or ‘a’–‘f’). Like the same construct in ISO C, the escape sequence continues until the first nonhexadecimal digit is seen. (c.e.) However, using more than two hexadecimal digits produces undefined results. (The ‘\x’ escape sequence is not allowed in POSIX awk.)
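From: http://www.gnu.org/software/gawk/manual/html_node/Escape-Sequences.html
Note that this passage describes gawk. Python, as stated in the documentation quoted in the Solution, requires exactly two hex digits after \x.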