Question

I'm using python-twitter to search for tweets through Twitter's API, and I have a problem with Chinese search terms. Here is a minimal code sample that reproduces the problem:

# -*- coding: utf-8 -*-
import twitter

api = twitter.Api(consumer_key = "...", consumer_secret = "...",
                  access_token_key = "...", access_token_secret = "...")

api.VerifyCredentials()
print u"您说英语吗"
r = api.GetSearch(term=u"您说英语吗")

I get this error:

您说英语吗
Traceback (most recent call last):
  File "so.py", line 9, in <module>
    r = api.GetSearch(term=u"您说英语吗")
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/python_twitter-0.8.7-py2.7.egg/twitter.py", line 2419, in GetSearch
    json = self._FetchUrl(url, parameters=parameters)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/python_twitter-0.8.7-py2.7.egg/twitter.py", line 4041, in _FetchUrl
    url = req.to_url()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/oauth2-1.5.211-py2.7.egg/oauth2/__init__.py", line 440, in to_url
    urllib.urlencode(query, True), fragment)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib.py", line 1337, in urlencode
    l.append(k + '=' + quote_plus(str(elt)))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-4: ordinal not in range(128)

Solution

This looks like a bug in GetSearch: https://code.google.com/p/python-twitter/issues/detail?id=210. I tried searching for "Putin" in Russian ("Путин") and got the same error. Experimenting with different encodings of the search term didn't help.
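The last frame of the traceback shows where it fails: Python 2's urllib.urlencode calls str() on each value, and str() on a unicode object falls back to the ASCII codec. Below is a minimal sketch of that step in isolation, assuming text values are encoded to UTF-8 before urlencode sees them. utf8_urlencode is a hypothetical helper, not part of python-twitter or oauth2, and this does not patch GetSearch itself:

```python
# -*- coding: utf-8 -*-
# Sketch only: reproduces the query-string step from the traceback in isolation.
try:
    from urllib import urlencode        # Python 2
except ImportError:
    from urllib.parse import urlencode  # Python 3

TEXT_TYPE = type(u"")  # unicode on Python 2, str on Python 3


def utf8_urlencode(params):
    """Percent-encode params, converting text values to UTF-8 bytes first.

    Hypothetical helper for illustration; not part of python-twitter or oauth2.
    """
    encoded = []
    for key, value in sorted(params.items()):
        if isinstance(value, TEXT_TYPE):
            # Encode explicitly so urlencode never has to call str() on
            # non-ASCII text, which is what raises UnicodeEncodeError on Python 2.
            value = value.encode("utf-8")
        encoded.append((key, value))
    return urlencode(encoded)


query = utf8_urlencode({"q": u"您说英语吗"})
print(query)
```

The result contains only ASCII characters, because every byte of the UTF-8 encoded term is percent-escaped.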

As a workaround, you can use the twitter package instead (https://github.com/sixohsix/twitter):

# -*- coding: utf-8 -*-
from twitter import *

t = Twitter(auth=OAuth(token="...", token_secret="...", consumer_key="...", consumer_secret="..."))

print t.search.tweets(q=u"您说英语吗")

OTHER TIPS

Also, try adding the code below (Python 2 only) before working with non-English text:

import sys

reload(sys)
sys.setdefaultencoding("utf-8")

Licensed under: CC-BY-SA with attribution