Question

I want to write a cmp-like function which compares two version numbers and returns -1, 0, or 1 based on their compared valuses.

  • Return -1 if version A is older than version B
  • Return 0 if version A and B are equivalent
  • Return 1 if version A is newer than version B

Each subsection is supposed to be interpreted as a number, therefore 1.10 > 1.1.

Desired function outputs are

mycmp('1.0', '1') == 0
mycmp('1.0.0', '1') == 0
mycmp('1', '1.0.0.1') == -1
mycmp('12.10', '11.0.0.0.0') == 1
...

And here is my implementation, open for improvement:

def mycmp(version1, version2):
    parts1 = [int(x) for x in version1.split('.')]
    parts2 = [int(x) for x in version2.split('.')]

    # fill up the shorter version with zeros ...
    lendiff = len(parts1) - len(parts2)
    if lendiff > 0:
        parts2.extend([0] * lendiff)
    elif lendiff < 0:
        parts1.extend([0] * (-lendiff))

    for i, p in enumerate(parts1):
        ret = cmp(p, parts2[i])
        if ret: return ret
    return 0

I'm using Python 2.4.5 btw. (installed at my working place ...).

Here's a small 'test suite' you can use

assert mycmp('1', '2') == -1
assert mycmp('2', '1') == 1
assert mycmp('1', '1') == 0
assert mycmp('1.0', '1') == 0
assert mycmp('1', '1.000') == 0
assert mycmp('12.01', '12.1') == 0
assert mycmp('13.0.1', '13.00.02') == -1
assert mycmp('1.1.1.1', '1.1.1.1') == 0
assert mycmp('1.1.1.2', '1.1.1.1') == 1
assert mycmp('1.1.3', '1.1.3.000') == 0
assert mycmp('3.1.1.0', '3.1.2.10') == -1
assert mycmp('1.1', '1.10') == -1
Was it helpful?

Solution

Remove the uninteresting part of the string (trailing zeroes and dots), and then compare the lists of numbers.

import re

def mycmp(version1, version2):
    def normalize(v):
        return [int(x) for x in re.sub(r'(\.0+)*$','', v).split(".")]
    return cmp(normalize(version1), normalize(version2))

EDIT: same approach as Pär Wieslander, but a bit more compact.

Some tests, thanks to this post:

assert mycmp("1", "1") == 0
assert mycmp("2.1", "2.2") < 0
assert mycmp("3.0.4.10", "3.0.4.2") > 0
assert mycmp("4.08", "4.08.01") < 0
assert mycmp("3.2.1.9.8144", "3.2") > 0
assert mycmp("3.2", "3.2.1.9.8144") < 0
assert mycmp("1.2", "2.1") < 0
assert mycmp("2.1", "1.2") > 0
assert mycmp("5.6.7", "5.6.7") == 0
assert mycmp("1.01.1", "1.1.1") == 0
assert mycmp("1.1.1", "1.01.1") == 0
assert mycmp("1", "1.0") == 0
assert mycmp("1.0", "1") == 0
assert mycmp("1.0", "1.0.1") < 0
assert mycmp("1.0.1", "1.0") > 0
assert mycmp("1.0.2.0", "1.0.2") == 0

OTHER TIPS

How about using Python's distutils.version.StrictVersion?

>>> from distutils.version import StrictVersion
>>> StrictVersion('10.4.10') > StrictVersion('10.4.9')
True

So for your cmp function:

>>> cmp = lambda x, y: StrictVersion(x).__cmp__(y)
>>> cmp("10.4.10", "10.4.11")
-1

If you want to compare version numbers that are more complex distutils.version.LooseVersion will be more useful, however be sure to only compare the same types.

>>> from distutils.version import LooseVersion, StrictVersion
>>> LooseVersion('1.4c3') > LooseVersion('1.3')
True
>>> LooseVersion('1.4c3') > StrictVersion('1.3')  # different types
False

LooseVersion isn't the most intelligent tool, and can easily be tricked:

>>> LooseVersion('1.4') > LooseVersion('1.4-rc1')
False

To have success with this breed, you'll need to step outside the standard library and use setuptools's parsing utility parse_version.

>>> from pkg_resources import parse_version
>>> parse_version('1.4') > parse_version('1.4-rc2')
True

So depending on your specific use-case, you'll need to decide whether the builtin distutils tools are enough, or if it's warranted to add as a dependency setuptools.

Is reuse considered elegance in this instance? :)

# pkg_resources is in setuptools
# See http://peak.telecommunity.com/DevCenter/PkgResources#parsing-utilities
def mycmp(a, b):
    from pkg_resources import parse_version as V
    return cmp(V(a),V(b))

No need to iterate over the version tuples. The built in comparison operator on lists and tuples already works exactly like you want it. You'll just need to zero extend the version lists to the corresponding length. With python 2.6 you can use izip_longest to pad the sequences.

from itertools import izip_longest
def version_cmp(v1, v2):
    parts1, parts2 = [map(int, v.split('.')) for v in [v1, v2]]
    parts1, parts2 = zip(*izip_longest(parts1, parts2, fillvalue=0))
    return cmp(parts1, parts2)

With lower versions, some map hackery is required.

def version_cmp(v1, v2):
    parts1, parts2 = [map(int, v.split('.')) for v in [v1, v2]]
    parts1, parts2 = zip(*map(lambda p1,p2: (p1 or 0, p2 or 0), parts1, parts2))
    return cmp(parts1, parts2)

This is a little more compact than your suggestion. Rather than filling the shorter version with zeros, I'm removing trailing zeros from the version lists after splitting.

def normalize_version(v):
    parts = [int(x) for x in v.split(".")]
    while parts[-1] == 0:
        parts.pop()
    return parts

def mycmp(v1, v2):
    return cmp(normalize_version(v1), normalize_version(v2))

Remove trailing .0 and .00 with regex, split and use cmp function which compares arrays correctly.

def mycmp(v1,v2):
 c1=map(int,re.sub('(\.0+)+\Z','',v1).split('.'))
 c2=map(int,re.sub('(\.0+)+\Z','',v2).split('.'))
 return cmp(c1,c2)

and of course you can convert it to a one-liner if you don't mind the long lines

def compare_version(v1, v2):
    return cmp(*tuple(zip(*map(lambda x, y: (x or 0, y or 0), 
           [int(x) for x in v1.split('.')], [int(y) for y in v2.split('.')]))))

It's a one liner (split for legability). Not sure about readable...

Lists are comparable in python, so if one converts the strings representing the numbers in to integers, the basic python comparison can be used with success.

I however needed to extend a bit this approach, first cause I use python3x where cmp function does not exist any-more I had to emulate cmp(a,b) with (a > b) - (a < b).

Second, sadly, version numbers are not that clean at all, can contain all kind of other alphanumeric chars. There are cases when the function can't tell the order so return's False (see the first example).

So posting this even if the question is old and answered already, but may save a few minutes from ones life.

import re

def _preprocess(v, separator, ignorecase):
    if ignorecase: v = v.lower()
    return [int(x) if x.isdigit() else [int(y) if y.isdigit() else y for y in re.findall("\d+|[a-zA-Z]+", x)] for x in v.split(separator)]

def compare(a, b, separator = '.', ignorecase = True):
    a = _preprocess(a, separator, ignorecase)
    b = _preprocess(b, separator, ignorecase)
    try:
        return (a > b) - (a < b)
    except:
        return False

print(compare('1.0', 'beta13'))    
print(compare('1.1.2', '1.1.2'))
print(compare('1.2.2', '1.1.2'))
print(compare('1.1.beta1', '1.1.beta2'))

In case you don't want to pull in an external dependency here is an attempt of mine (written for python 3.x). "rc", "rel" (and possibly one could add "c") are regarded as "release candidate" and devide the version number into two parts and if missing the value of the second part is high (999). Else letters produce a split and are dealt as sub-numbers via base 36 code.


    import re
    from itertools import chain
    def compare_version(version1,version2):
        '''compares two version numbers
        >>> compare_version('1', '2') >> compare_version('2', '1') > 0
        True
        >>> compare_version('1', '1') == 0
        True
        >>> compare_version('1.0', '1') == 0
        True
        >>> compare_version('1', '1.000') == 0
        True
        >>> compare_version('12.01', '12.1') == 0
        True
        >>> compare_version('13.0.1', '13.00.02') >> compare_version('1.1.1.1', '1.1.1.1') == 0
        True
        >>> compare_version('1.1.1.2', '1.1.1.1') >0
        True
        >>> compare_version('1.1.3', '1.1.3.000') == 0
        True
        >>> compare_version('3.1.1.0', '3.1.2.10') >> compare_version('1.1', '1.10') >> compare_version('1.1.2','1.1.2') == 0
        True
        >>> compare_version('1.1.2','1.1.1') > 0
        True
        >>> compare_version('1.2','1.1.1') > 0
        True
        >>> compare_version('1.1.1-rc2','1.1.1-rc1') > 0
        True
        >>> compare_version('1.1.1a-rc2','1.1.1a-rc1') > 0
        True
        >>> compare_version('1.1.10-rc1','1.1.1a-rc2') > 0
        True
        >>> compare_version('1.1.1a-rc2','1.1.2-rc1') >> compare_version('1.11','1.10.9') > 0
        True
        >>> compare_version('1.4','1.4-rc1') > 0
        True
        >>> compare_version('1.4c3','1.3') > 0
        True
        >>> compare_version('2.8.7rel.2','2.8.7rel.1') > 0
        True
        >>> compare_version('2.8.7.1rel.2','2.8.7rel.1') > 0
        True

        '''
        chn = lambda x:chain.from_iterable(x)
        def split_chrs(strings,chars):
            for ch in chars:
                strings = chn( [e.split(ch) for e in strings] )
            return strings
        split_digit_char=lambda x:[s for s in re.split(r'([a-zA-Z]+)',x) if len(s)>0]
        splt = lambda x:[split_digit_char(y) for y in split_chrs([x],'.-_')]
        def pad(c1,c2,f='0'):
            while len(c1) > len(c2): c2+=[f]
            while len(c2) > len(c1): c1+=[f]
        def base_code(ints,base):
            res=0
            for i in ints:
                res=base*res+i
            return res
        ABS = lambda lst: [abs(x) for x in lst]
        def cmp(v1,v2):
            c1 = splt(v1)
            c2 = splt(v2)
            pad(c1,c2,['0'])
            for i in range(len(c1)): pad(c1[i],c2[i])
            cc1 = [int(c,36) for c in chn(c1)]
            cc2 = [int(c,36) for c in chn(c2)]
            maxint = max(ABS(cc1+cc2))+1
            return base_code(cc1,maxint) - base_code(cc2,maxint)
        v_main_1, v_sub_1 = version1,'999'
        v_main_2, v_sub_2 = version2,'999'
        try:
            v_main_1, v_sub_1 = tuple(re.split('rel|rc',version1))
        except:
            pass
        try:
            v_main_2, v_sub_2 = tuple(re.split('rel|rc',version2))
        except:
            pass
        cmp_res=[cmp(v_main_1,v_main_2),cmp(v_sub_1,v_sub_2)]
        res = base_code(cmp_res,max(ABS(cmp_res))+1)
        return res


    import random
    from functools import cmp_to_key
    random.shuffle(versions)
    versions.sort(key=cmp_to_key(compare_version))
from distutils.version import StrictVersion
def version_compare(v1, v2, op=None):
    _map = {
        '<': [-1],
        'lt': [-1],
        '<=': [-1, 0],
        'le': [-1, 0],
        '>': [1],
        'gt': [1],
        '>=': [1, 0],
        'ge': [1, 0],
        '==': [0],
        'eq': [0],
        '!=': [-1, 1],
        'ne': [-1, 1],
        '<>': [-1, 1]
    }
    v1 = StrictVersion(v1)
    v2 = StrictVersion(v2)
    result = cmp(v1, v2)
    if op:
        assert op in _map.keys()
        return result in _map[op]
    return result

Implement for php version_compare, except "=". Because it's ambiguous.

The most difficult to read solution, but a one-liner nevertheless! and using iterators to be fast.

next((c for c in imap(lambda x,y:cmp(int(x or 0),int(y or 0)),
            v1.split('.'),v2.split('.')) if c), 0)

that is for Python2.6 and 3.+ btw, Python 2.5 and older need to catch the StopIteration.

Another solution:

def mycmp(v1, v2):
    import itertools as it
    f = lambda v: list(it.dropwhile(lambda x: x == 0, map(int, v.split('.'))[::-1]))[::-1]
    return cmp(f(v1), f(v2))

One can use like this too:

import itertools as it
f = lambda v: list(it.dropwhile(lambda x: x == 0, map(int, v.split('.'))[::-1]))[::-1]
f(v1) <  f(v2)
f(v1) == f(v2)
f(v1) >  f(v2)

Did this in order to be able to parse and compare the debian package version string. Please notice that it is not strict with the character validation.

This might be helpful as well.

#!/usr/bin/env python

# Read <https://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-Version> for further informations.

class CommonVersion(object):
    def __init__(self, version_string):
        self.version_string = version_string
        self.tags = []
        self.parse()

    def parse(self):
        parts = self.version_string.split('~')
        self.version_string = parts[0]
        if len(parts) > 1:
            self.tags = parts[1:]


    def __lt__(self, other):
        if self.version_string < other.version_string:
            return True
        for index, tag in enumerate(self.tags):
            if index not in other.tags:
                return True
            if self.tags[index] < other.tags[index]:
                return True

    @staticmethod
    def create(version_string):
        return UpstreamVersion(version_string)

class UpstreamVersion(CommonVersion):
    pass

class DebianMaintainerVersion(CommonVersion):
    pass

class CompoundDebianVersion(object):
    def __init__(self, epoch, upstream_version, debian_version):
        self.epoch = epoch
        self.upstream_version = UpstreamVersion.create(upstream_version)
        self.debian_version = DebianMaintainerVersion.create(debian_version)

    @staticmethod
    def create(version_string):
        version_string = version_string.strip()
        epoch = 0
        upstream_version = None
        debian_version = '0'

        epoch_check = version_string.split(':')
        if epoch_check[0].isdigit():
            epoch = int(epoch_check[0])
            version_string = ':'.join(epoch_check[1:])
        debian_version_check = version_string.split('-')
        if len(debian_version_check) > 1:
            debian_version = debian_version_check[-1]
            version_string = '-'.join(debian_version_check[0:-1])

        upstream_version = version_string

        return CompoundDebianVersion(epoch, upstream_version, debian_version)

    def __repr__(self):
        return '{} {}'.format(self.__class__.__name__, vars(self))

    def __lt__(self, other):
        if self.epoch < other.epoch:
            return True
        if self.upstream_version < other.upstream_version:
            return True
        if self.debian_version < other.debian_version:
            return True
        return False


if __name__ == '__main__':
    def lt(a, b):
        assert(CompoundDebianVersion.create(a) < CompoundDebianVersion.create(b))

    # test epoch
    lt('1:44.5.6', '2:44.5.6')
    lt('1:44.5.6', '1:44.5.7')
    lt('1:44.5.6', '1:44.5.7')
    lt('1:44.5.6', '2:44.5.6')
    lt('  44.5.6', '1:44.5.6')

    # test upstream version (plus tags)
    lt('1.2.3~rc7',          '1.2.3')
    lt('1.2.3~rc1',          '1.2.3~rc2')
    lt('1.2.3~rc1~nightly1', '1.2.3~rc1')
    lt('1.2.3~rc1~nightly2', '1.2.3~rc1')
    lt('1.2.3~rc1~nightly1', '1.2.3~rc1~nightly2')
    lt('1.2.3~rc1~nightly1', '1.2.3~rc2~nightly1')

    # test debian maintainer version
    lt('44.5.6-lts1', '44.5.6-lts12')
    lt('44.5.6-lts1', '44.5.7-lts1')
    lt('44.5.6-lts1', '44.5.7-lts2')
    lt('44.5.6-lts1', '44.5.6-lts2')
    lt('44.5.6-lts1', '44.5.6-lts2')
    lt('44.5.6',      '44.5.6-lts1')

i'm using this one on my project:

cmp(v1.split("."), v2.split(".")) >= 0

My preferred solution:

Padding the string with extra zeroes and just using the four first is easy to understand, doesn't require any regex and the lambda is more or less readable. I use two lines for readability, for me elegance is short and simple.

def mycmp(version1,version2):
  tup = lambda x: [int(y) for y in (x+'.0.0.0.0').split('.')][:4]
  return cmp(tup(version1),tup(version2))

This is my solution (written in C, sorry). I hope you'll find it useful

int compare_versions(const char *s1, const char *s2) {
    while(*s1 && *s2) {
        if(isdigit(*s1) && isdigit(*s2)) {
            /* compare as two decimal integers */
            int s1_i = strtol(s1, &s1, 10);
            int s2_i = strtol(s2, &s2, 10);

            if(s1_i != s2_i) return s1_i - s2_i;
        } else {
            /* compare as two strings */
            while(*s1 && !isdigit(*s1) && *s2 == *s1) {
                s1++;
                s2++;
            }

            int s1_i = isdigit(*s1) ? 0 : *s1;
            int s2_i = isdigit(*s2) ? 0 : *s2;

            if(s1_i != s2_i) return s1_i - s2_i;
        }
    }

    return 0;
}
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top