Question

I'm confused: raw strings appear to convert every \ to \\, yet when a \ appears at the end of the string, it raises an error.

>>> r'so\m\e \te\xt'
'so\\m\\e \\te\\xt'

>>> r'so\m\e \te\xt\'
SyntaxError: EOL while scanning string literal

Update:

This is now covered in the Python FAQ as well: Why can’t raw strings (r-strings) end with a backslash?

Solution

You still need \ to escape ' or " in raw strings, since otherwise the Python interpreter wouldn't know where the string stops. In your example, the trailing \ escapes the closing ', so the literal is never terminated.

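Without that rule, the first line below could not be written, and the second shows what happens when the quote is left unescaped: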

r'it wouldn\'t be possible to store this string'
r'since it'd produce a syntax error without the escape'

Look at the syntax highlighting to see what I mean.
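As a small illustration of the point above, the following sketch compares a raw and a non-raw literal that both escape the inner quote. The variable names raw and cooked are made up for this example. In the raw string the backslash stays in the result; in the normal string it is consumed.

```python
# Both literals escape the inner quote so the tokenizer can find the end.
raw = r'it wouldn\'t'     # raw: the backslash remains part of the string
cooked = 'it wouldn\'t'   # non-raw: the backslash is consumed by the escape

print(len(raw), len(cooked))  # 12 11
print(raw == cooked)          # False
print('\\' in raw)            # True
```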

OTHER TIPS

Raw strings can't end in a single backslash because of how the parser works (no actual escaping takes place, though). The workaround is to append the backslash as a separate, non-raw string literal:

>>> print(r'foo\')
  File "<stdin>", line 1
    print(r'foo\')
                 ^
SyntaxError: EOL while scanning string literal
>>> print(r'foo''\\')
foo\

Not pretty, but it works. You can add a plus sign to make it clearer what is happening, but it's not necessary:

>>> print(r'foo' + '\\')
foo\
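A quick check, written as a script rather than a REPL session, that both spellings of the workaround produce the same four-character string as an ordinary literal with a doubled backslash. The variable names are illustrative only.

```python
with_adjacent = r'foo' '\\'   # adjacent literals are joined at compile time
with_plus = r'foo' + '\\'     # explicit concatenation
plain = 'foo\\'               # non-raw literal with an escaped backslash

print(with_adjacent == with_plus == plain)  # True
print(len(with_adjacent))                   # 4
print(with_adjacent.endswith('\\'))         # True
```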

Python strings are processed in two steps:

  1. First, the tokenizer looks for the closing quote. It recognizes backslashes while doing so but doesn't interpret them: it simply looks for a sequence of string elements followed by the closing quote mark, where a "string element" is either a character that is not a backslash, a closing quote, or a newline (newlines are allowed inside triple-quoted strings), or a backslash followed by any single character.

  2. Then the contents of the string are interpreted (backslash escapes are processed) depending on what kind of string it is. The r flag before a string literal only affects this step.
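The first step can be roughly modeled with a regular expression. This is only a sketch under simplifying assumptions: it handles single-quoted literals without prefixes, and the names STRING_LITERAL and find_literal are invented for this example. It is not how CPython's tokenizer is actually implemented, but it follows the same rule: a backslash always pairs with the character after it, whatever that character is.

```python
import re

# Simplified model of step 1 for single-quoted literals only (an assumption
# made for illustration; the real tokenizer is more involved).
# A string element is either an ordinary character (not a backslash, quote,
# or newline) or a backslash followed by any single character.
STRING_LITERAL = re.compile(r"""'(?:[^\\'\n]|\\.)*'""", re.DOTALL)


def find_literal(source):
    """Return the literal found at the start of source, or None."""
    match = STRING_LITERAL.match(source)
    return match.group(0) if match else None


print(find_literal(r"'so\m\e \te\xt'"))   # the whole literal is found
print(find_literal(r"'so\m\e \te\xt\'"))  # None: the backslash swallowed the last quote
print(find_literal(r"'foo\\'"))           # an even number of backslashes is fine
```

Note that the pattern never asks whether the literal is raw. That decision only matters in the second step, which is why the r prefix cannot rescue a trailing backslash.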

Quote from https://docs.python.org/3.4/reference/lexical_analysis.html#literals:

Even in a raw literal, quotes can be escaped with a backslash, but the backslash remains in the result; for example, r"\"" is a valid string literal consisting of two characters: a backslash and a double quote; r"\" is not a valid string literal (even a raw string cannot end in an odd number of backslashes). Specifically, a raw literal cannot end in a single backslash (since the backslash would escape the following quote character). Note also that a single backslash followed by a newline is interpreted as those two characters as part of the literal, not as a line continuation.

So in a raw string, backslashes are not treated specially, except when they precede " or '. Therefore, r'\' and r"\" are not valid string literals: the closing quote is escaped, so the literal is never terminated. In that case it makes no difference whether the r prefix is present, i.e. r'\' is just as invalid as '\', and r"\" is just as invalid as "\".
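One way to check this without crashing a script is to hand the source text to the built-in compile() and catch the SyntaxError. The helper is_valid_literal below is a hypothetical name used only for this sketch.

```python
def is_valid_literal(source):
    """Return True if source compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


# Each entry is the *source text* of a literal, so backslashes are doubled here.
candidates = [
    "r'\\'",     # r'\'   single trailing backslash, raw
    "'\\'",      # '\'    single trailing backslash, non-raw
    "r'\\\\'",   # r'\\'  even number of backslashes
    "r'\\''",    # r'\''  escaped quote followed by a real closing quote
]

for source in candidates:
    print(source, is_valid_literal(source))
```

The first two candidates are rejected and the last two are accepted, matching the rule that a literal cannot end in an odd number of backslashes.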

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow