Question

I have a misunderstanding about how a regexp gets encoded:

>>> simplejson.dumps({'title':r'\d+'})
'{"title": "\\\\d+"}'
>>> simplejson.loads('{"title": "\\\\d+"}')
{u'title': u'\\d+'}
>>> print simplejson.loads('{"title": "\\\\d+"}')['title']
\d+

So without print I see \\, and with print I see \. Which one does the value in the loaded dict actually contain: \\ or \?


Solution

Here is a trick: use list to see which characters are really in the string:

In [3]: list(u'\\d+')
Out[3]: [u'\\', u'd', u'+']

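list breaks the string up into individual characters, so u'\\' is a single character. The double backslash in u'\\' is an escape sequence that represents one backslash character. The interactive prompt displays the repr of a value, which escapes the backslash, while print writes the string's actual characters. The loaded value is correct, since r'\d+' also contains only one backslash: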

In [4]: list(r'\d+')
Out[4]: ['\\', 'd', '+']
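
To double-check this without reading escaped output by eye, here is a minimal sketch. It assumes Python 3 and the standard library json module in place of simplejson (the round trip should behave the same way for this input); the names pattern, encoded and decoded are only illustrative.

```python
import json

# Raw string: three characters (backslash, 'd', '+').
pattern = r'\d+'

# JSON escapes the backslash, so the encoded text holds two of them.
encoded = json.dumps({'title': pattern})

# Decoding undoes the JSON escaping and restores the single backslash.
decoded = json.loads(encoded)['title']

print(len(decoded))         # 3
print(decoded == pattern)   # True
print(encoded.count('\\'))  # 2 backslashes in the JSON text
print(decoded.count('\\'))  # 1 backslash in the loaded value
print(repr(decoded))        # '\\d+'  (what the prompt shows)
print(decoded)              # \d+     (what print shows)
```

Counting characters with len or count sidesteps the question of how the value happens to be displayed.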
Licensed under: CC-BY-SA with attribution