Question

Let's say you have a string S and a sequence of digits in a list L such that len(S) = len(L).

What would be the cleanest way of checking whether there is a bijection between the characters of the string and the digits in the sequence, such that each character matches one and only one digit?

For example, "aabbcc" should match with 115522 but not 123456 or 111111.

I have a complex setup with two dicts and a loop, but I'm wondering if there's a cleaner way of doing this, maybe by using some function from the Python libraries.

Solution

I would use a set for this:

In [9]: set("aabbcc")
Out[9]: set(['a', 'c', 'b'])

In [10]: set(zip("aabbcc", [1, 1, 5, 5, 2, 2]))
Out[10]: set([('a', 1), ('c', 2), ('b', 5)])

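The second set will have the same length as the first set if and only if every letter is paired with exactly one number (if not, the second set will contain the same letter paired with two different numbers). Comparing the second set against the set of distinct numbers in the same way covers the opposite direction, where one number is paired with two different letters.

Here is code that implements the idea: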

def is_bijection(seq1, seq2):
    distinct1 = set(seq1)
    distinct2 = set(seq2)
    distinctMappings = set(zip(seq1, seq2))
    return len(distinct1) == len(distinct2) == len(distinctMappings)

This can also return True when one sequence is longer than the other, as long as the extra elements introduce no new distinct values, because zip stops at the end of the shorter sequence. If the sequences must be the same length, you should add a check for that.
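
The length check mentioned above could be added in front of the set comparison. This is a minimal sketch, assuming both arguments support len(); the name is_strict_bijection is made up for illustration and is not part of the original answer:

```python
def is_strict_bijection(seq1, seq2):
    # Reject inputs of different length up front, because zip() silently
    # stops at the end of the shorter sequence.
    if len(seq1) != len(seq2):
        return False
    pairs = set(zip(seq1, seq2))
    # Equal sizes mean every item of seq1 is paired with exactly one item
    # of seq2, and every item of seq2 with exactly one item of seq1.
    return len(set(seq1)) == len(set(seq2)) == len(pairs)


print(is_strict_bijection("aabbcc", [1, 1, 5, 5, 2, 2]))   # True
print(is_strict_bijection("aabbcc", [1, 2, 3, 4, 5, 6]))   # False
print(is_strict_bijection("aabbcc", [1, 1, 1, 1, 1, 1]))   # False
print(is_strict_bijection("aabbcca", [1, 1, 5, 5, 2, 2]))  # False
```

The last call is the case that the version without the length check would accept.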

Other tips

There's a more elegant way to do this (with sorting and itertools.groupby), but I'm way too sleep-deprived to figure that out right now. But this should still work:

In [171]: import collections

In [172]: S = "aabbcc"

In [173]: L = [1, 1, 5, 5, 2, 2]

In [174]: mapping = collections.defaultdict(list)

In [175]: reverseMapping = collections.defaultdict(list)

In [176]: for digit, char in zip(L, S):
   .....:     mapping[digit].append(char)
   .....:     reverseMapping[char].append(digit)
   .....:

In [177]: all(len(set(v))==1 for v in mapping.values()) and all(len(set(v))==1 for v in reverseMapping.values())
Out[177]: True

In [181]: S = "aabbcc"

In [182]: L = [1, 2, 3, 4, 5, 6]

In [183]: mapping = collections.defaultdict(list)

In [184]: reverseMapping = collections.defaultdict(list)

In [185]: for digit, char in zip(L, S):
   .....:     mapping[digit].append(char)
   .....:     reverseMapping[char].append(digit)
   .....:

In [186]: all(len(set(v))==1 for v in mapping.values()) and all(len(set(v))==1 for v in reverseMapping.values())
Out[186]: False
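
The same two-dictionary idea could be wrapped in a function that stops at the first conflict. This is only a sketch, not the code from the session above; the name consistent_both_ways is hypothetical, and it assumes the two inputs have equal length:

```python
def consistent_both_ways(chars, digits):
    forward = {}   # char -> the digit it was first paired with
    backward = {}  # digit -> the char it was first paired with
    for char, digit in zip(chars, digits):
        # setdefault() stores the value on first sight and returns the
        # stored value afterwards, so a mismatch signals a conflict.
        if forward.setdefault(char, digit) != digit:
            return False
        if backward.setdefault(digit, char) != char:
            return False
    return True


print(consistent_both_ways("aabbcc", [1, 1, 5, 5, 2, 2]))  # True
print(consistent_both_ways("aabbcc", [1, 2, 3, 4, 5, 6]))  # False
print(consistent_both_ways("aabbcc", [1, 1, 1, 1, 1, 1]))  # False
```

Unlike the defaultdict(list) version, this keeps one entry per key instead of a list of every value seen.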

Hope this helps

This respects the order:

>>> s = "aabbcc"
>>> n = 115522
>>> l1 = dict(zip(s, str(n))).items()
>>> l2 = zip(s, str(n))
>>> l1
[('a', '1'), ('c', '2'), ('b', '5')]
>>> l2
[('a', '1'), ('a', '1'), ('b', '5'), ('b', '5'), ('c', '2'), ('c', '2')]
>>> not bool([i for i in l2 if i not in l1])
True
>>> n = 115225
>>> l1 = dict(zip(s, str(n))).items()
>>> l2 = zip(s, str(n))
>>> not bool([i for i in l2 if i not in l1])
False
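
Note that the check above only confirms that every character is paired with a single digit, so it would also accept 111111, which the question rules out. A sketch of a two-way variant follows, assuming Python 3; the helper names maps_one_way and maps_both_ways are made up for illustration:

```python
def maps_one_way(keys, values):
    pairs = list(zip(keys, values))
    # dict() keeps the last value seen for each key, so a key paired with
    # two different values leaves at least one pair out of the dict.
    last_seen = dict(pairs)
    return all(last_seen[k] == v for k, v in pairs)


def maps_both_ways(s, n):
    digits = str(n)
    # Run the same check in both directions to rule out shared digits.
    return maps_one_way(s, digits) and maps_one_way(digits, s)


print(maps_both_ways("aabbcc", 115522))  # True
print(maps_both_ways("aabbcc", 115225))  # False
print(maps_both_ways("aabbcc", 111111))  # False
```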

Since bijections are normally defined between sets, I assume, unlike the other answers, that the order of the digits need not match the order of the letters. If so, there's a short, elegant solution, but it requires the collections.Counter class, which was introduced in Python 2.7. For those stuck with an older version, there's a backport for 2.5+.

from collections import Counter

def bijection_exists_between(a, b):
    return sorted(Counter(a).values()) == sorted(Counter(b).values())

Testing:

>>> bijection_exists_between("aabbcc", "123123")
True
>>> bijection_exists_between("aabbcc", "123124")
False
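
To see why comparing sorted counts is enough under this order-independent reading, it can help to look at the intermediate values. The snippet below is only an illustration; the helper name frequency_profile is made up here:

```python
from collections import Counter


def frequency_profile(seq):
    # How many times each distinct item occurs, ignoring which item it is.
    return sorted(Counter(seq).values())


print(frequency_profile("aabbcc"))  # [2, 2, 2]
print(frequency_profile("123123"))  # [2, 2, 2]
print(frequency_profile("123124"))  # [1, 1, 2, 2]
```

The first two profiles are equal, so each letter can be assigned a digit that occurs equally often; the third differs, so no such assignment exists.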

Your examples are rather light on edge cases, because another way of reading your question allows for the number of digits and number of letters to be unequal (i.e. you look for a bijection from the set of unique characters to the set of unique digits, so that e.g. "aabbcc" would biject onto "123333"). If this is what you meant, use this version instead:

def bijection_exists_between(a, b):
    return len(set(a)) == len(set(b))

import itertools

a = 'aabbcc'
b = 112233

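# Sort the (char, digit) pairs so that equal characters end up adjacent.
z = sorted(zip(str(a), str(b)))
# Every group of pairs sharing a character must contain one distinct pair.
# Note: this only checks that each character maps to a single digit.
x = all(
    len(set(g)) == 1
    for k, g in itertools.groupby(z, key=lambda pair: pair[0])
)
print(x)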

or:

import itertools

a = 'aabbcc'
b = 112233

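# Materialize the pairs so they can be iterated twice (zip is lazy in Python 3).
z = list(zip(str(a), str(b)))
# Two pairs must share a character exactly when they share a digit.
x = all(
    (z1[0] == z2[0]) == (z1[1] == z2[1]) for z1 in z for z2 in z
)
print(x)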
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow