Question

I've noticed that Python lets me do this:

>>> {1: "foo"} < {2: "bar"}
True

It lets me do the same thing for lists, deques, etc. What are the semantics of < when applied to dictionaries in Python?

In general where can I find out the semantics of < for any given type of collection? In most cases it seems not to be found in the documentation. For example:

>>> help(dict.__cmp__)

Help on wrapper_descriptor:

__cmp__(...)
    x.__cmp__(y) <==> cmp(x,y)

>>> help(cmp)

Help on built-in function cmp in module __builtin__:

cmp(...)
    cmp(x, y) -> integer

    Return negative if x<y, zero if x==y, positive if x>y.

I ask because I have a list of tuples of the form (int, dict). I want to sort this array based on the first element, but if the first elements are equal for two items then I don't care about the second. I'd like to know if myArray.sort() will do something complicated involving recursing through the dicts in this case, or if it will just return an arbitrary value.

Was it helpful?

Solution

Quoting from comparison docs,

Tuples and Lists

Tuples and lists are compared lexicographically using comparison of corresponding elements. This means that to compare equal, each element must compare equal and the two sequences must be of the same type and have the same length.

If not equal, the sequences are ordered the same as their first differing elements. For example, cmp([1,2,x], [1,2,y]) returns the same as cmp(x,y). If the corresponding element does not exist, the shorter sequence is ordered first (for example, [1,2] < [1,2,3]).

Dictionaries

Mappings (dictionaries) compare equal if and only if their sorted (key, value) lists compare equal. (The implementation computes this efficiently, without constructing lists or sorting.) Outcomes other than equality are resolved consistently, but are not otherwise defined. (Earlier versions of Python [prior to 2.7.6] used lexicographic comparison of the sorted (key, value) lists, but this was very expensive for the common case of comparing for equality. An even earlier version of Python compared dictionaries by identity only, but this caused surprises because people expected to be able to test a dictionary for emptiness by comparing it to {}.)

Also, find this part of the documentation, which specifically comparing sequence types with themselves and other types,

Sequence objects may be compared to other objects with the same sequence type. The comparison uses lexicographical ordering: first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted. If two items to be compared are themselves sequences of the same type, the lexicographical comparison is carried out recursively. If all items of two sequences compare equal, the sequences are considered equal. If one sequence is an initial sub-sequence of the other, the shorter sequence is the smaller (lesser) one. Lexicographical ordering for strings uses the ASCII ordering for individual characters.

Note that comparing objects of different types is legal. The outcome is deterministic but arbitrary: the types are ordered by their name. Thus, a list is always smaller than a string, a string is always smaller than a tuple, etc. (The rules for comparing objects of different types should not be relied upon; they may change in a future version of the language.) Mixed numeric types are compared according to their numeric value, so 0 equals 0.0, etc.

Actual dictionary comparison, as per Python 2.7 source code, goes like this

  1. Compare the length of keys first. (-1 is returned if first has lesser keys, 1 if second has lesser keys)

  2. If they are the same, then it tries to find a key for which either the key is missing in the other or different (this is called as characterizing the dict)

  3. It does the step 2, either ways, both a, b and b, a. If either of them is empty, then both the dictionaries are assumed to be equal.

  4. Now, the differences we got from characterizing the dictionaries will be compared to get the actual comparison result.

OTHER TIPS

Like the answer from @thefourtheye.

Written in python, it can be explained like this:

def dict_compare(a, b):
    if len(a) != len(b):                # STEP 1: compare by length
        return -1 if len(a) < len(b) else 1

    res = 0
    akey, aval = characterize(a, b)     # Find first k, v that a[k] != b[k]
    bkey, bval = characterize(b, a)
    if akey is None:  # if no difference
        return 0
    if bkey is not None:               # STEP 2: compare by key
        res = cmp(akey, bkey)
    if res == 0 and bval is not None:  # STEP 3: compare by value
        res = cmp(aval, bval)
    return res

Where characterize fucntion is something like this:

def characterize(a, b):
    """Find the first k that a[k] != b[k]"""
    akey, aval = None, None
    for k, v in a.items():
        if akey < k:
            continue
        if (k not in b) or (a != b[k]):
            akey, aval = k, v

    return akey, aval
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top