Question

When netloc is empty, urlparse.urlunparse behaves inconsistently:

>>> urlparse.urlunparse(('http','','test_path', None, None, None))
'http:///test_path'
>>> urlparse.urlunparse(('ftp','','test_path', None, None, None))
'ftp:///test_path'
>>> urlparse.urlunparse(('ssh','','test_path', None, None, None))
'ssh:test_path'

Is it a bug or a feature? I would expect urlunparse to always behave as in the first example, even if the scheme is not recognized.

Solution

The data tuple you are passing to urlunparse has six components:

scheme, netloc, url, params, query, fragment = data

urlunparse appends params to url (if any) and hands the remaining five components to urlunsplit.

When there is no netloc and the scheme is not in uses_netloc, no '//' is inserted, so the url is just

    url = scheme + ':' + url

That is the way urlunparse (which calls urlunsplit) is defined:

def urlunsplit(data):
    ...
    scheme, netloc, url, query, fragment = data
    if netloc or (scheme and scheme in uses_netloc and url[:2] != '//'):
        if url and url[:1] != '/': url = '/' + url
        url = '//' + (netloc or '') + url
    if scheme:
        url = scheme + ':' + url

Note that 'ssh' is not in uses_netloc:

uses_netloc = ['ftp', 'http', 'gopher', 'nntp', 'telnet',
               'imap', 'wais', 'file', 'mms', 'https', 'shttp',
               'snews', 'prospero', 'rtsp', 'rtspu', 'rsync', '',
               'svn', 'svn+ssh', 'sftp','nfs','git', 'git+ssh']

You do get a URL that begins with ssh:// if you supply a netloc:

In [140]: urlparse.urlunparse(('ssh','netloc','test_path', None, None, None))
Out[140]: 'ssh://netloc/test_path'
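
If you want the 'scheme:///path' form regardless of whether the scheme is recognized, one option is to assemble the string yourself instead of relying on uses_netloc. The following is only a sketch: unparse_with_slashes is a hypothetical helper, not a urlparse function, and it assumes you always want the '//' marker, even with an empty netloc.

```python
def unparse_with_slashes(scheme, netloc, path,
                         params=None, query=None, fragment=None):
    """Hypothetical helper: always emit '//', whatever the scheme is."""
    # Make the path absolute so it cannot be confused with the netloc.
    if path and not path.startswith('/'):
        path = '/' + path
    url = '//' + (netloc or '') + path
    if params:
        url = url + ';' + params
    if scheme:
        url = scheme + ':' + url
    if query:
        url = url + '?' + query
    if fragment:
        url = url + '#' + fragment
    return url


print(unparse_with_slashes('ssh', '', 'test_path'))        # ssh:///test_path
print(unparse_with_slashes('ssh', 'netloc', 'test_path'))  # ssh://netloc/test_path
```

This trades the library's scheme-specific behavior for uniform output, so it is only appropriate when every scheme you handle really uses the '//' authority syntax.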
Licensed under: CC-BY-SA with attribution