Question

I'm trying to use orgnode.py (from here) to parse org files. The files contain English and Persian text, and file -i reports them as utf-8 encoded. But I receive this error when I call the makelist function (which itself uses codecs.open with utf-8):

>>> Orgnode.makelist("toread.org")
[**  [[http://www.apa.org/helpcenter/sexual-orientation.aspx][Sexual orientation, homosexuality and bisexuality]]            :ToRead:



Added:[2013-11-06 Wed]
, **  [[http://stackoverflow.com/questions/11384516/how-to-make-all-org-files-under-a-folder-added-in-agenda-list-automatically][emacs - How to make all org-files under a folder added in agenda-list automatically? - Stack Overflow]] 

(setq org-agenda-text-search-extra-files '(agenda-archives "~/org/subdir/textfile1.txt" "~/org/subdir/textfile1.txt"))
Added:[2013-07-23 Tue] 
, Traceback (most recent call last):
File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 63-66: ordinal not in range(128)

The function returns a list of org headings, but instead of the last item (which is written in Persian) it shows the error. Any suggestions on how I can deal with this error?


Solution

As the traceback tells you, the exception is raised by the statement you input on the Python console itself (Orgnode.makelist("toread.org")), and not in one of the functions called during the evaluation of the statement.

This is typical of encoding errors when the interpreter automatically converts the return value of the statement to display it back on the console. The text displayed is the result of applying the repr() builtin to the return value.

Here the repr() of each node returned by makelist is a unicode object, which the interpreter tries to convert to str using the "ascii" codec by default. That conversion fails as soon as it reaches the Persian heading.
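The implicit conversion can be reproduced by hand. The following is only a rough sketch of what happens, not the interpreter's actual code path: the helper to_console_bytes and the sample word are made up for illustration, and the encoding step that Python 2 performs silently is written out explicitly.

```python
# A Persian word, written with escapes so this file stays pure ASCII.
text = u"\u0633\u0644\u0627\u0645"


def to_console_bytes(value, encoding="ascii"):
    """Hypothetical helper: do explicitly what Python 2 does implicitly
    when a unicode object has to become a byte string."""
    return value.encode(encoding)


try:
    to_console_bytes(text)
    failed = False
    message = ""
except UnicodeEncodeError as exc:
    # Same error family as in the traceback from the question.
    failed = True
    message = str(exc)

# With an explicit codec that covers Persian, the conversion succeeds.
utf8_bytes = to_console_bytes(text, "utf-8")
```

Passing an explicit codec such as utf-8 avoids the failure, which is the idea behind the workaround.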

The culprit is the Orgnode.__repr__ method (https://github.com/albins/orgnode/blob/master/Orgnode.py#L592), which returns a unicode object (because the node content has already been decoded by codecs.open), although __repr__ methods are usually expected to return strings containing only safe (ASCII) characters.

Here is the smallest change you can do to Orgnode as a workaround for your problem:

--- a/Orgnode.py
+++ b/Orgnode.py
@@ -612,4 +612,4 @@ class Orgnode(object):
 # following will output the text used to construct the object
         n = n + "\n" + self.body

-        return n
+        return n.encode('utf-8')

If you want a version which only returns ASCII characters, you can use 'unicode-escape' as the codec instead of 'utf-8' (n is a unicode object, so it needs the unicode variant of the escape codec).
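To see the difference between the two kinds of output, here is a small comparison. It is a sketch with a made-up sample heading, and it assumes the 'unicode-escape' codec, which is the escape codec that accepts unicode objects.

```python
# Sample heading (hypothetical), written with escapes to keep this file ASCII.
heading = u"\u0633\u0644\u0627\u0645"

# UTF-8 bytes: readable on a UTF-8 terminal, but not ASCII.
as_utf8 = heading.encode("utf-8")

# Escaped form: only ASCII bytes, each character spelled as \uXXXX.
as_escaped = heading.encode("unicode-escape")

# Both forms can be decoded back to the original text.
roundtrip_utf8 = as_utf8.decode("utf-8")
roundtrip_escaped = as_escaped.decode("unicode-escape")
```

The UTF-8 form is nicer to read on a terminal that supports it, while the escaped form is safe everywhere.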

This is only a quick and dirty fix. The right solution would be to write a proper __repr__ method, and also add the __str__ and __unicode__ methods that this class lacks. (I might even fix this myself if I find the time, as I am quite interested in using Python code to manipulate my Org-mode files.)
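As a rough idea of what such methods could look like, here is a minimal sketch. The Node class below is a hypothetical stand-in, not the real Orgnode class, and its heading and body attributes and to_text helper are assumptions made for the example. It is written so that it runs on both Python 2 and Python 3.

```python
import sys

PY2 = sys.version_info[0] == 2


class Node(object):
    """Hypothetical stand-in for Orgnode with only a heading and a body."""

    def __init__(self, heading, body=u""):
        self.heading = heading
        self.body = body

    def to_text(self):
        # Full text of the node, as a unicode string.
        return self.heading + u"\n" + self.body

    def __unicode__(self):
        # Used by unicode(node) on Python 2; ignored on Python 3.
        return self.to_text()

    def __str__(self):
        # Python 2 expects bytes here, Python 3 expects text.
        text = self.to_text()
        return text.encode("utf-8") if PY2 else text

    def __repr__(self):
        # ASCII-only: non-ASCII characters are shown as \uXXXX escapes.
        escaped = self.heading.encode("unicode-escape")
        if not PY2:
            escaped = escaped.decode("ascii")
        return "<Node %s>" % escaped


# Displaying a list calls __repr__ on each item, so this stays ASCII-safe.
nodes = [Node(u"\u0633\u0644\u0627\u0645", u"Added:[2013-11-06 Wed]")]
listing = repr(nodes)
```

With a __repr__ like this, echoing the result of a function such as makelist on the console should no longer depend on the console's default codec.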

Licensed under: CC-BY-SA with attribution