Why are return values from compare functions in many languages defined loosely?

https://softwareengineering.stackexchange.com/questions/269900

06-10-2020
|

Question

Many languages define that compare functions should return ANY negative value, zero, or ANY positive value. Is there some reason that it shouldn't be clearly defined as -1 0 and 1? Does a wide range in possible return values help more advanced algorithms to work more efficiently? If so, what algorithms work this way?

Solution

If you allow comparison function to return any negative value instead of exactly -1, this can make for a simpler implementation. For instance, you can write

return this.position - that.position;

instead of having to write

if(this.position == that.position) {
    return 0;
}
else if(this.position < that.position) {
    return -1;
} else
    return 1;
}

(The alternative is to use an operator, like Perl's <=>, that generates exactly -1, 0, or 1. But it's easier to define a lenient API than to get a new operator into your language, unless you're Larry Wall.)

OTHER TIPS

For simple use cases it allows for a very trivial implementation:

public int compare(Child a, Child b) {
    return a.age - b.age;
}
//sort children by age to determine who babysits

Meanwhile if more complex logic is required it is still easy to return the magic numbers -1, 0, or 1 once order is determined.

As CodesInChaos says in the comments the subtraction method fails to accommodate any overflow that may occur. General purpose libraries require greater robustness and complexity in their comparisons.

Here are a couple of a battle tested implementations:

Java's Integer.compare():

public static int compare(int x, int y) {
    return (x < y) ? -1 : ((x == y) ? 0 : 1);
}

Mono's Int32.CompareTo():

public int CompareTo (object value)
{
    if (value == null)
        return 1;

    if (!(value is System.Int32))
        throw new ArgumentException (Locale.GetText ("Value is not a System.Int32"));

    int xv = (int) value;
    if (m_value == xv)
        return 0;
    if (m_value > xv)
        return 1;
    else
        return -1;
}

As you can see both of these apply the required logic then return the appropriate magic number.

Return value semantics has almost nothing to do with implementation. This specification leaves no space for unspecified behavior and is statically verifiable.

If return value is specified to be an arbitrary number, this can checked statically - compiler can easily verify that function doesn't return string. Otherwise static verification is impossible - values that are outside of given range can't be eliminated statically for non-enum types. Therefore calling code will have to deal with unspecified values in runtime, emitting more unspecified behavior.

We can safely say that this decision is made to mimic behavior Jörg described in his comment - complete specification, every option covered.

It's because when you do a compare you are usually just taking the difference of two values and instead of wasting more time and effort into mapping the result to {-1,0,1}, just leave the result and define the function as so. For example x.comparedTo(y) = 0 means that if x and y are both 5, (5 - 5) = 0 but if x = 100 and y = 150, (100 - 150) = -50 and so x < y.

Licensed under: CC-BY-SA with attribution

Not affiliated with softwareengineering.stackexchange