Question

I can't seem to store unicode characters correctly in Postgres. They are shown as code representations, e.g. <C3><A5> instead of å for example.

My database was created with UTF-8 as the encoding. I have tried storing the unicode strings with psycopg2 like so:

field = myUnicodeString.encoding('utf-8')
cursor.execute("INSERT INTO mytable (column1) VALUES (%s)", (field,))

field = myUnicodeString
cursor.execute("INSERT INTO mytable (column1) VALUES (%s)", (field,))

but both alternatives store incorrect characters. Do I need to set the character set for the table as well, or what could the problem be here?

UPDATE 1:

I have discovered that I can't even type non-ascii characters – like å, ä and ö – in my terminal. I'm on an Ubuntu 12.04 server. Could this in any way be related to the language settings of the server itself?

UPDATE 2

I am now able to type non-ascii characters in my terminal during an SSH session. I changed the locale settings and rebooted the server. Moreover, I am able to manually store non-ascii characters in my UTF-8 database (in psql: INSERT INTO table (column) VALUES ('ö')). The character is displayed correctly in psql.

When I SELECT convert_to(column, 'utf-8') FROM table with the manually inserted row in the table, the char ö is displayed as \xc383c2b6 in psql.

When I do print repr('ö') in Python, I get '\xc3\xb6'. I'm really trying to understand how to debug this, but I'm not sure what to look for.

Was it helpful?

The solution

It isn't clear that you've confirmed whether the characters are stored incorrectly, or retrieved and displayed incorrectly. Nor is it clear whether the problem is in working with them in PostgreSQL, or before that in Python.

In this case, "å" is Unicode code point U+00E5, encoded in utf-16 BE as 0x00E5 or in utf-8 as 0xc3 0xa5. That matches what you're seeing - a utf-8 byte sequence representation of "å" - so I suspect your terminal is misconfigured and just can't display it, or is trying to interpret the bytes as latin-1 and doesn't have the right characters in the font for the resulting mangled text:

>>> print u'å'.encode("utf-8").decode("latin-1")
Ã¥

so it's showing the individual bytes instead.
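The failure mode above can be sketched end to end (Python 3 syntax; 'å' is just the example character from the question):

```python
# A sketch of the mis-decoding described above: correct UTF-8 bytes,
# decoded as Latin-1, become the familiar "Ã¥" mojibake.
s = 'å'                                  # U+00E5
utf8_bytes = s.encode('utf-8')
print(utf8_bytes)                        # b'\xc3\xa5'
mojibake = utf8_bytes.decode('latin-1')
print(mojibake)                          # Ã¥
# Re-encoding the mojibake as UTF-8 double-encodes the text:
print(mojibake.encode('utf-8'))          # b'\xc3\x83\xc2\xa5'
```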

It doesn't help that your Python code is nonsense:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'encoding'

I think you meant "encode". It's unnecessary to do this anyway; Python's psycopg2 happily works with unicode string objects directly:

>>> conn = psycopg2.connect("dbname=regress")
>>> curs = conn.cursor()
>>> curs.execute("SELECT %s", (u'áéíóú',))
>>> print curs.fetchone()[0]
áéíóú

With encoding problems you need to trace things through, step by step, to determine where the text encoding is being mishandled.

There's nowhere near enough information to answer a question like this. All I can really offer is general advice. At every step, confirm that you respect the encoding of the input, and that the output from one step is in the same encoding the next step expects as input.

First, you need to make sure that your unicode strings are correct in Python. print repr(mystring) will be useful for this, to see the string data. Then you should stop explicitly encoding them when passing them to psycopg2; just let psycopg2 deal with it.
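For instance, a quick way to inspect a string before it goes anywhere near psycopg2 (Python 3 syntax; 'ö' is just an example value):

```python
# A sketch of checking a string's content and bytes in Python itself,
# before blaming the database layer.
s = 'ö'
print(repr(s))               # 'ö'  -- a one-character str, as expected
print(len(s))                # 1
print(s.encode('utf-8'))     # b'\xc3\xb6'  -- the correct UTF-8 encoding
# A value that repr() shows as 'Ã¶' (two characters) was already
# mis-decoded somewhere upstream, before it reached this point.
```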

Next step will be to examine them in the database using psql. Even if they don't display correctly on your terminal you can check if they're right in the database with the convert_to function, which takes a database field or string literal as input and outputs the byte sequence in the desired encoding. So, eg:

SELECT convert_to(column1, 'utf-8') FROM mytable;

and make sure that the hex byte sequence returned matches what it should be for the utf-8 encoding of the text you sent.
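For reference, the expected hex sequence can be computed in Python and compared against what psql prints (Python 3 syntax; 'ö' matches the value from the question's second update, where the \xc383c2b6 output suggests the row was already double-encoded):

```python
# A sketch of computing the byte sequences to compare against
# psql's convert_to output.
good = 'ö'.encode('utf-8')
print(good.hex())            # c3b6 -- what a correctly stored 'ö' yields
# The value seen in the question's UPDATE 2:
bad = 'ö'.encode('utf-8').decode('latin-1').encode('utf-8')
print(bad.hex())             # c383c2b6 -- the tell-tale sign of double encoding
```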

Continue with this process. At each step, examine the string bytes to make sure they match what they should be, until you find the stage that's mishandling the text.

I assure you that neither PostgreSQL nor psycopg2 stores Unicode characters incorrectly. In this case it could be as simple an issue as your terminal being set up wrong, or it could be that something in the text handling path is using the wrong input encoding, so you encode something as utf-8 and then decode it as latin-1 (for example).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow