Question

I was seeking for a way to test whether a tuple has all distinct elements - so to say, it is a set and ended up with this quick and dirty solution.

def distinct ( tup):
    n=0
    for t in tup:
        for k in tup:
            #print t,k,n
            if (t == k ):
                n = n+1
    if ( n != len(tup)):
        return False
    else :
        return True


print distinct((1,3,2,10))

print distinct((3,3,4,2,7))

Any thinking error? Is there a builtin on tuples?

Was it helpful?

Solution

You can very easily do it as:

len(set(tup))==len(tup)

This creates a set of tup and checks if it is the same length as the original tup. The only case in which they would have the same length is if all elements in tup were unique

Examples

>>> a = (1,2,3)
>>> print len(set(a))==len(a)
True

>>> b = (1,2,2)
>>> print len(set(b))==len(b)
False

>>> c = (1,2,3,4,5,6,7,8,5)
>>> print len(set(c))==len(c)
False

OTHER TIPS

In the majority of the cases, where all of the items in the tuple are hashable and support comparison (using == operator) with each other, @sshashank124's solution is what you're after:

len(set(tup))==len(tup)

For the example you posted, i.e. a tuple of ints, that would do.


Else, if the items are not hashable, but do have order defined on them (support '==', '<' operators, etc.), the best you can do is sorting them (O(NlogN) worst case), and then look for adjacent equal elements:

sorted_tup = sorted(tup)
all( x!=y for x,y in zip(sorted_tup[:-1],sorted_tup[1:]) ) 

Else, if the items only support equality comparison (==), the best you can do is the O(N^2) worst case algorithm, i.e. comparing every item to every other.

I would implement it this way, using itertools.combinations:

def distinct(tup):
  for x,y in itertools.combinations(tup, 2):
    if x == y:
      return False
  return True

Or as a one-liner:

all( x!=y for x,y in itertools.combinations(tup, 2) )

What about using early_exit:

This will be faster:

def distinct(tup):
    s = set()
    for x in tup:
        if x in s:
            return False
        s.add(x)
    return True

Ok: Here is the test and profiling with 1000 int:

#!/usr/bin/python

def distinct1(tup):
    n=0
    for t in tup:
        for k in tup:
            if (t == k ):
                n = n+1
    if ( n != len(tup)):
        return False
    else :
        return True

def distinct2(tup):
    return len(set(tup))==len(tup)

def distinct3(tup):
    s = set()
    for x in tup:
        if x in s:
            return False
        s.add(x)
    return True

import cProfile
from faker import Faker
from timeit import Timer

s = Faker()
func = [ distinct1, distinct2, distinct3 ]
tuples = tuple([ s.random_int(min=1, max=9999) for x in range(1000) ])

for fun in func:
    t = Timer(lambda: fun(tuples))
    print fun.__name__, cProfile.run('t.timeit(number=1000)')

Output: distinct 2 and distinct3 are almost the same.

distinct1          3011 function calls in 60.289 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   60.289   60.289 <string>:1(<module>)
     1000   60.287    0.060   60.287    0.060 compare_tuple.py:3(distinct1)
     1000    0.001    0.000   60.288    0.060 compare_tuple.py:34(<lambda>)
        1    0.000    0.000    0.000    0.000 timeit.py:143(setup)
        1    0.000    0.000   60.289   60.289 timeit.py:178(timeit)
        1    0.001    0.001   60.289   60.289 timeit.py:96(inner)
        1    0.000    0.000    0.000    0.000 {gc.disable}
        1    0.000    0.000    0.000    0.000 {gc.enable}
        1    0.000    0.000    0.000    0.000 {gc.isenabled}
        1    0.000    0.000    0.000    0.000 {globals}
     1000    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        2    0.000    0.000    0.000    0.000 {time.time}


None
distinct2          4011 function calls in 0.053 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.053    0.053 <string>:1(<module>)
     1000    0.052    0.000    0.052    0.000 compare_tuple.py:14(distinct2)
     1000    0.000    0.000    0.053    0.000 compare_tuple.py:34(<lambda>)
        1    0.000    0.000    0.000    0.000 timeit.py:143(setup)
        1    0.000    0.000    0.053    0.053 timeit.py:178(timeit)
        1    0.000    0.000    0.053    0.053 timeit.py:96(inner)
        1    0.000    0.000    0.000    0.000 {gc.disable}
        1    0.000    0.000    0.000    0.000 {gc.enable}
        1    0.000    0.000    0.000    0.000 {gc.isenabled}
        1    0.000    0.000    0.000    0.000 {globals}
     2000    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        2    0.000    0.000    0.000    0.000 {time.time}


None
distinct3          183011 function calls in 0.072 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.072    0.072 <string>:1(<module>)
     1000    0.051    0.000    0.070    0.000 compare_tuple.py:17(distinct3)
     1000    0.002    0.000    0.072    0.000 compare_tuple.py:34(<lambda>)
        1    0.000    0.000    0.000    0.000 timeit.py:143(setup)
        1    0.000    0.000    0.072    0.072 timeit.py:178(timeit)
        1    0.000    0.000    0.072    0.072 timeit.py:96(inner)
        1    0.000    0.000    0.000    0.000 {gc.disable}
        1    0.000    0.000    0.000    0.000 {gc.enable}
        1    0.000    0.000    0.000    0.000 {gc.isenabled}
        1    0.000    0.000    0.000    0.000 {globals}
   181000    0.019    0.000    0.019    0.000 {method 'add' of 'set' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        2    0.000    0.000    0.000    0.000 {time.time}


None
a = (1,2,3,4,5,6,7,8,9)
b = (1,2,3,4,5,6,6,7,8,9)
def unique(tup):
    i = 0
    while i < len(tup):
        if tup[i] in tup[i+1:]:
            return False
        i = i + 1    
    return True    

unique(a)
True
unique(b)
False

Readable and works with hashable items. People seem adverse to "while". This also allows for "special cases" and filtering on attributes.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top