Question

I'm working on a system that will require manual data entry of 10-digit numbers (Σ = 0123456789). To help prevent data errors, I want to avoid generating any two strings that have a hamming distance of 1.

For example, if I generate the string 0123456789 then I never want to generate any of these strings: {1123456789, 2123456789, 3123456789, ...}

What is the largest set of unique strings in the universe of possible strings that satisfy the constraint where no two strings have a hamming distance of 1? If this set can be identified, is there any reasonable way to enumerate it?

Was it helpful?

Solution

There is a simple solution: use exactly the strings with sums of digits being divisible by $10$. There are $10^9$ such strings and it is easy to enumerate them, find $i$-th of them in the lexicographic order, generate random such string, et cetera.

Indeed, if two strings differ in exactly one position, then their sums of digits are also different modulo $10$. For example, the sums of digits for strings $0123456789$, $1123456789$, $\ldots$, $9123456789$ are $45$, $46$, $\ldots$, $54$ respectively.

Moreover, this solution is largest possible. Indeed, there are $10^9$ such strings: for every way to choose first $9$ digits, there is an exactly one way to choose the last digit to make the sum divisible by $10$. Hence, there are $10^9$ strings in our set.

On the other hand, any set of strings with minimal Hamming distance $2$ has at most $10^9$ elements. Indeed, consider $10$ strings with a given prefix of length $9$. They all are within Hamming distance $1$ of each other (because they differ only in the last position). Hence, at most one of them can belong to any "good" set. Therefore, any "good" set contains at most $10^9$ elements.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top