I was reading through Python's pprint module and noticed that its _safe_repr checks whether 'locale' is in sys.modules before calling repr on a str value:

def _safe_repr(object, context, maxlevels, level):
    typ = type(object)
    if typ is str:
        if 'locale' not in _sys.modules: # <-------------------------------
            return repr(object), True, False
        if "'" in object and '"' not in object:
            closure = '"'
            quotes = {'"': '\\"'}
        else:
            closure = "'"
            quotes = {"'": "\\'"}
        qget = quotes.get
        sio = _StringIO()
        write = sio.write
        for char in object:
            if char.isalpha():
                write(char)
            else:
                write(qget(char, repr(char)[1:-1]))
        return ("%s%s%s" % (closure, sio.getvalue(), closure)), True, False

Source: https://github.com/python/cpython/blob/master/Lib/pprint.py#L315

In what scenario would a locale affect the value of repr(some_str)?


Solution

See this thread and this one on the Python-dev mailing list. They describe an issue wherein users on systems where non-ASCII characters were printable (e.g., accented characters) wanted repr to retain those characters as-is, instead of showing them as escaped byte sequences.

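I'm not actually sure this is the reason for the specific code you mention, but it seems plausible that it's related. What seems strange is that isalpha can also be locale-dependent (in Python 2, str.isalpha on 8-bit strings was; in Python 3 it follows the Unicode character database instead), so I'm not sure how (or if) it actually works.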
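For what it's worth, the two branches can be compared directly. Below is a minimal sketch, where slow_path_repr is a hypothetical standalone copy of the slow branch from the question (the name is mine, it is not part of pprint). On Python 3 I would expect it to agree with plain repr for these samples, which suggests the locale check matters mostly for older byte-string behavior, though I have not confirmed that this was the original motivation.

```python
from io import StringIO


def slow_path_repr(text):
    """Rebuild a str repr character by character, like the quoted branch."""
    # Pick the quote style the same way the quoted code does.
    if "'" in text and '"' not in text:
        closure = '"'
        quotes = {'"': '\\"'}
    else:
        closure = "'"
        quotes = {"'": "\\'"}
    out = StringIO()
    for char in text:
        if char.isalpha():
            # Letters are copied through unescaped.
            out.write(char)
        else:
            # Everything else uses the escape table or repr() of the character.
            out.write(quotes.get(char, repr(char)[1:-1]))
    return closure + out.getvalue() + closure


samples = ["hello", "it's", 'say "hi"', "both ' and \"", "tab\there", "café"]
for s in samples:
    # Print both forms and whether they match.
    print(repr(s), slow_path_repr(s), repr(s) == slow_path_repr(s))
```

If the last column ever printed False for some input, that input would be a concrete case where the slow branch and repr disagree.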

Other tips

First off, I don't intend this to be an answer, since I don't think it is one. It was too big for a comment, and there are a few things I want to chip in.

There are a couple of reasons it would check for locale. The primary one, I believe, is that the character encoding of a string is heavily dependent on the locale. The second is that when the locale module is loaded, it can cause a string to be formatted in a way that is inconsistent with what pprint would like.

Another thing I found: when running on a Windows system, I get the following:

>>> import sys
>>> 'locale' in sys.modules
True

However, when I run the same test on my GoDaddy shell account:

>>> import sys
>>> 'locale' in sys.modules
False

So, this could be a method to, among other things, quickly check what kind of operating system the user is running on, and then act accordingly.
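One caveat on that idea: the check reflects what has been imported so far in the running process, not the operating system as such, so startup configuration can change the answer on the same machine. A small sketch (locale_loaded is a hypothetical helper name) that shows the value flipping once the module is imported:

```python
import importlib
import sys


def locale_loaded():
    """Return True if the locale module has already been imported."""
    return 'locale' in sys.modules


# The value before the import depends on what startup code already loaded.
before = locale_loaded()

# Importing the module registers it in sys.modules.
importlib.import_module('locale')
after = locale_loaded()

print("before import:", before)
print("after import:", after)
```

The "before" value may differ between environments, while "after" should be True everywhere.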

Also, interestingly enough, I performed the following test on both a Linux system and on a Windows system:

>>> import locale
>>> locale.getdefaultlocale()

The Linux system returned:

(None, None)

The Windows system returned:

('en_US', 'cp1252')

So what I think is happening is that some of these characters might be encoded differently depending on the locale, which would change what we want repr() to show us. These locale-dependent effects on character encodings and string formatting would, I think, produce inconsistent output across systems.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow