Python: extended ASCII codes

Question 1

When you print a list, Python outputs the default representation of each element, i.e. it calls repr() on each of them. The repr() of a string is its escaped form, by design. To output the elements themselves, convert the list to a string first, e.g. via ', '.join(li).
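A minimal Python 3 sketch of the difference, assuming a two-element list named li as in the text above (the sample values are made up for illustration). Note that Python 3's repr() keeps printable non-ASCII characters as they are and only adds the quotes, whereas Python 2 shows them as escapes such as '\xf7'.

```python
# Illustrative list; the name `li` follows the answer, the values are assumptions.
li = ['\xf7', 'abc']

as_list = str(li)        # what print(li) shows: the repr() of every element
joined = ', '.join(li)   # the elements themselves, joined into one string

print(as_list)   # quotes and brackets from the list representation
print(joined)    # just the characters
```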

Note that, as the commenters have pointed out, there isn't really any such thing as "extended ASCII"; there are just various different encodings.

Question 2

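You probably want the charmap encoding, which lets you turn Unicode into bytes without 'magic' conversions: with no custom mapping table it should turn each code point from 0 to 255 into the byte with the same value, as latin-1 does.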

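s = '\xf7'                # U+00F7, DIVISION SIGN
b = s.encode('charmap')   # a single byte, 0xF7
# Write the raw byte to standard output, bypassing text-mode encoding
# (/dev/stdout exists on Unix-like systems only).
with open('/dev/stdout', 'wb') as f:
    f.write(b)
    f.flush()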

This prints ÷ on my system.

Note that 'extended ASCII' refers to any of a number of proprietary extensions to ASCII, none of which were ever officially adopted and all of which are incompatible with each other. As a result, the symbol output by that code will vary based on the controlling terminal's choice of how to interpret it.
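To see how the same byte can mean different things, here is a small Python 3 sketch that decodes 0xF7 under a few code pages. The particular codecs are just an illustrative selection from those shipped with Python; the Unicode character names are printed so the output stays readable on any terminal.

```python
import unicodedata

raw = b'\xf7'

# A few single-byte encodings, chosen here only as examples.
codecs_to_try = ('latin-1', 'cp1252', 'cp437', 'mac_roman')

decoded = {name: raw.decode(name) for name in codecs_to_try}

for name, char in decoded.items():
    # Under latin-1 this byte is U+00F7 DIVISION SIGN; other code pages may differ.
    print(name, 'U+%04X' % ord(char), unicodedata.name(char))
```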

Question 3

There's no single defined standard named "extended ASCII codes". There are, however, plenty of characters, tens of thousands of them, defined in the Unicode standard.

You may be limited by the character encoding of your text terminal, which you may think of as "extended ASCII" but which might be "latin-1", for example. (If you are on a Unix-like system such as Linux or Mac OS X, your text terminal will likely use UTF-8 and be able to display any of the tens of thousands of characters available in Unicode.)

So you should read this article to understand what text has been since 1992. If you build any production application while believing in "extended ASCII", you are harming yourself, your users and the whole ecosystem at once: http://www.joelonsoftware.com/articles/Unicode.html

That said, Python 2's (and Python 3's) print performs an implicit str conversion on the objects passed in. For a list, this conversion does not recursively call str on each element; instead, it uses each element's repr, which displays non-ASCII characters as numeric escapes or other unsuitable notations.

You can simply join your desired characters into a unicode string, for example, and then print it normally, using the terminal encoding:

# Python 2
import sys

mytext = u""
# Check the codes for Unicode characters here:
# http://en.wikipedia.org/wiki/List_of_Unicode_characters
mytext += unichr(247)

print mytext.encode(sys.stdout.encoding, errors="replace")
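For Python 3, a rough equivalent might look like the sketch below. The helper name encode_for_terminal and the UTF-8 fallback are assumptions made for illustration, not part of the original answer; in Python 3, chr() takes the place of unichr(), and print() already encodes str for the terminal, so explicit encoding is only needed when you want the bytes yourself.

```python
import sys

def encode_for_terminal(text, encoding=None):
    """Encode text, replacing characters the target encoding cannot represent.

    Hypothetical helper: uses the given encoding, else sys.stdout.encoding,
    else UTF-8 (stdout may report no encoding when it has been replaced).
    """
    encoding = encoding or getattr(sys.stdout, 'encoding', None) or 'utf-8'
    return text.encode(encoding, errors='replace')

mytext = chr(247)  # Python 3 spelling of unichr(247)

print(encode_for_terminal(mytext, 'latin-1'))  # b'\xf7'
print(encode_for_terminal(mytext, 'utf-8'))    # b'\xc3\xb7'
print(encode_for_terminal(mytext, 'ascii'))    # b'?'
```

The last call shows what errors="replace" does when the terminal encoding has no slot for the character.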

Question 4

You are doing nothing wrong.

What you are doing is adding a string of length 1 to a list.

This string contains a character outside the range of printable ASCII characters, and indeed outside ASCII altogether (which is only 7-bit). That's why its representation looks like '\xf7'.

If you print it, it will be rendered as well as the system can manage.

In Python 2, the byte is simply written out as-is. The resulting output may be the division symbol or something else entirely, depending on your system's encoding.

In Python 3, it is a Unicode character and will be processed according to how stdout is set up. Normally, this should indeed be the division symbol.

In the representation of a list, the __repr__() of each string is called, which leads to what you see.
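The Python 2 versus Python 3 distinction can be approximated in Python 3 alone by comparing a bytes object (close to what a Python 2 str holds) with a str. This is only a sketch; the variable names are illustrative.

```python
text_item = '\xf7'    # str: one Unicode code point, U+00F7
byte_item = b'\xf7'   # bytes: one raw byte with value 247

text_list_repr = str([text_item])   # built from repr(text_item)
byte_list_repr = str([byte_item])   # built from repr(byte_item)
escaped = ascii(text_item)          # like repr(), but non-ASCII is escaped

print(ord(text_item), byte_item[0])   # 247 247
print(byte_list_repr)                 # [b'\xf7']
print(escaped)                        # '\xf7'
```

The escaped form from ascii() matches what Python 2 shows for the same character inside a list.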