Question

I cannot request the URL "http://www.besondere-raumdüfte.de" with urllib2.urlopen().
I tried to encode the string using urllib.urlencode with utf-8, idna, and ascii, but it still doesn't work.
It raises URLError: <urlopen error unknown url type.

Solution

What you need is u"http://www.besondere-raumdüfte.de/".encode('idna'). Please note how the source string is a Unicode constant (the u prefix).

The result is a URL usable with urlopen().
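
As a rough illustration of what the idna codec does, here is a minimal sketch. It is written for Python 3 (an assumption; the answer above targets Python 2, where you would start from a u"..." literal), and it encodes only the host name rather than the whole URL.

```python
# Python 3 sketch (assumption: the original answer targets Python 2).
host = "www.besondere-raumdüfte.de"

# The built-in "idna" codec rewrites each non-ASCII label of the host
# into its ASCII-compatible "xn--" (Punycode) form.
ascii_host = host.encode("idna").decode("ascii")

# Rebuild a plain ASCII URL around the encoded host.
url = "http://" + ascii_host + "/"
print(url)
```

The printed URL contains only ASCII characters, so it can be handed to urlopen().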

If both the domain name and the rest of the URL contain non-ASCII characters, you need to .encode('idna') the domain part and apply iri2uri() to the rest.
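
The split described above could be sketched as follows. This is a Python 3 approximation, not the answer's exact method: the helper name iri_to_ascii_url is hypothetical, and urllib.parse.quote stands in for iri2uri() on the path, query, and fragment. It assumes the URL has a host and no user:password part.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_ascii_url(iri):
    """Hypothetical helper: IDNA-encode the host, percent-encode the rest."""
    parts = urlsplit(iri)
    if parts.hostname is None:
        raise ValueError("URL has no host: %r" % iri)

    # Only the host name goes through the idna codec; keep an explicit port.
    host = parts.hostname.encode("idna").decode("ascii")
    if parts.port is not None:
        host = "%s:%d" % (host, parts.port)

    # Non-ASCII characters elsewhere are percent-encoded as UTF-8.
    # "%" is kept safe so existing escapes are not encoded twice.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")

    return urlunsplit((parts.scheme, host, path, query, fragment))


print(iri_to_ascii_url("http://www.besondere-raumdüfte.de/düfte?q=süß"))
```

Treating the host separately matters because host names use Punycode, while the path and query use percent-encoding; applying one scheme to the whole URL would mangle the other part.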

OTHER TIPS

You are working with an IRI, not a URI, so you have to convert it correctly. The following is an example of how to do it:

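from httplib2 import iri2uri

def iri_to_uri(iri):
    """Transform a unicode IRI into an ASCII URI (Python 2)."""
    # Reject byte strings early so callers must pass unicode.
    if not isinstance(iri, unicode):
        raise TypeError('iri %r should be unicode.' % iri)
    # In Python 2, bytes is an alias for str, so this returns a byte string.
    return bytes(iri2uri(iri))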

Once you have a URI, you should be able to use urllib2.

Licensed under: CC-BY-SA with attribution