How do I use extended characters in Python's curses library?
Question
I've been reading tutorials about Curses programming in Python, and many refer to an ability to use extended characters, such as line-drawing symbols. They're characters > 255, and the curses library knows how to display them in the current terminal font.
Some of the tutorials say you use it like this:
c = ACS_ULCORNER
...and some say you use it like this:
c = curses.ACS_ULCORNER
(That's supposed to be the upper-left corner of a box, like an L flipped vertically)
Anyway, regardless of which method I use, the name is not defined and the program thus fails. I tried "import curses" and "from curses import *", and neither works.
Curses' window() function makes use of these characters, so I even tried poking around on my box for the source to see how it does it, but I can't find it anywhere.
Solution
From curses/__init__.py
:
Some constants, most notably the
ACS_*
ones, are only added to the C_curses
module's dictionary afterinitscr()
is called. (Some versions of SGI's curses don't define values for those constants untilinitscr()
has been called.) This wrapper function calls the underlying Cinitscr()
, and then copies the constants from the_curses
module to the curses package's dictionary. Don't do 'from curses import *
' if you'll be needing theACS_*
constants.
In other words:
>>> import curses
>>> curses.ACS_ULCORNER
exception
>>> curses.initscr()
>>> curses.ACS_ULCORNER
>>> 4194412
OTHER TIPS
I believe the below is appropriately related, to be posted under this question. Here I'll be using utfinfo.pl (see also on Super User).
First of all, for standard ASCII character set, the Unicode code point and the byte encoding is the same:
$ echo 'a' | perl utfinfo.pl
Char: 'a' u: 97 [0x0061] b: 97 [0x61] n: LATIN SMALL LETTER A [Basic Latin]
So we can do in Python's curses
:
window.addch('a')
window.border('a')
... and it works as intended
However, if a character is above basic ASCII, then there are differences, which addch
docs don't necessarily make explicit. First, I can do:
window.addch(curses.ACS_PI)
window.border(curses.ACS_PI)
... in which case, in my gnome-terminal
, the Unicode character 'π' is rendered. However, if you inspect ACS_PI
, you'll see it's an integer number, with a value of 4194427 (0x40007b); so the following will also render the same character (or rater, glyph?) 'π':
window.addch(0x40007b)
window.border(0x40007b)
To see what's going on, I grepped through the ncurses
source, and found the following:
#define ACS_PI NCURSES_ACS('{') /* Pi */
#define NCURSES_ACS(c) (acs_map[NCURSES_CAST(unsigned char,c)])
#define NCURSES_CAST(type,value) static_cast<type>(value)
#lib_acs.c: NCURSES_EXPORT_VAR(chtype *) _nc_acs_map(void): MyBuffer = typeCalloc(chtype, ACS_LEN);
#define typeCalloc(type,elts) (type *)calloc((elts),sizeof(type))
#./widechar/lib_wacs.c: { '{', { '*', 0x03c0 }}, /* greek pi */
Note here:
$ echo '{π' | perl utfinfo.pl
Got 2 uchars
Char: '{' u: 123 [0x007B] b: 123 [0x7B] n: LEFT CURLY BRACKET [Basic Latin]
Char: 'π' u: 960 [0x03C0] b: 207,128 [0xCF,0x80] n: GREEK SMALL LETTER PI [Greek and Coptic]
... neither of which relates to the value of 4194427 (0x40007b) for ACS_PI
.
Thus, when addch
and/or border
see a character above ASCII (basically an unsigned int
, as opposed to unsigned char
), they (at least in this instance) use that number not as Unicode code point, or as UTF-8 encoded bytes representation - but instead, they use it as a look-up index for acs_map
-ping function (which ultimately, however, would return the Unicode code point, even if it emulates VT-100). That is why the following specification:
window.addch('π')
window.border('π')
will fail in Python 2.7 with argument 1 or 3 must be a ch or an int
; and in Python 3.2 would render simply a space instead of a character. When we specify 'π'
. we've actually specified the UTF-8 encoding [0xCF,0x80] - but even if we specify the Unicode code point:
window.addch(0x03C0)
window.border0x03C0)
... it simply renders nothing (space) in both Python 2.7 and 3.2.
That being said - the function addstr
does accept UTF-8 encoded strings, and works fine:
window.addstr('π')
... but for borders - since border()
apparently handles characters in the same way addch()
does - we're apparently out of luck, for anything not explicitly specified as an ACS
constant (and there's not that many of them, either).
Hope this helps someone,
Cheers!
you have to set your local to all, then encode your output as utf-8 as follows:
import curses
import locale
locale.setlocale(locale.LC_ALL, '') # set your locale
scr = curses.initscr()
scr.clear()
scr.addstr(0, 0, u'\u3042'.encode('utf-8'))
scr.refresh()
# here implement simple code to wait for user input to quit
scr.endwin()
output: あ