Language of the values of an affine function

https://cs.stackexchange.com/questions/640

16-10-2019
|

Question

Write $\bar n$ for the decimal expansion of $n$ (with no leading 0). Let $a$ and $b$ be integers, with $a > 0$. Consider the language of the decimal expansions of the multiples of $a$ plus a constant:

$$M = \{ \overline{a\,x+b} \mid x\in\mathbb{N} \}$$

Is $M$ regular? context-free?

(Contrast with Language of the graph of an affine function)

_{I think this would make a good homework question, so answers that start with a hint or two and explain not just how to solve the question but also how to decide what techniques to use would be appreciated.}

Solution 5

The language is regular.

Hint: cast out nines

Proof idea

For $a=9$ and $b < 9$,

build an automaton with $9$ states labeled $0$ through $8$. $0$ is the initial state, and the one final state is $b$. From state $s$, on digit $d$, transition to state $(s + d) \;\mathrm{mod}\; 9$.

To handle other values of $a$ that are coprime with $10$,

group digits in packets to find some $k$ such that $a$ divides $10^k-1$ (e.g. take $k=3$ if $a=37$ because $999 = 27 \times 37$).

To handle values of $a$ whose only prime factors are $2$ and $5$,

note that it's all about a finite number of digits at the end.

To generalize to all values of $a$ and $b$,

use the fact that union and intersection of regular languages are regular, that finite languages are regular, and that the multiples of $a_1 \cdot a_2$ are exactly the multiples of both when $a_1$ and $a_2$ are coprime.

Note that we use whichever technique is convenient; the three main elementary techniques (regular expressions, finite automata, set-theoretic properties) are all represented in this proof.

Detailed proof

Let $a = 2^p 5^q a'$ with $a'$ coprime with $10$. Let $M' = \{\overline{a'\,x+b} \mid x\in\mathbb{Z} \wedge a'\,x+b \ge 0\}$ and $M'' = \{\overline{2^p 5^q\,x+b} \mid x\in\mathbb{Z} \wedge 2^p 5^q\,x+b \ge 0\}$. By elementary arithmetic, the numbers equal to $b$ modulo $a$ are exactly the numbers equal to $b$ modulo $a'$ and to $b$ modulo $2^p5^q$, so $M \cap \{\overline{x} \mid x \ge b\} = M' \cap M'' \cap \{\overline{x} \mid x \ge b\}$. Since the intersection of regular languages is regular, and $\{\overline{x} \mid x \ge b\}$ is regular because it is the complement of a finite (hence regular) language, if $M'$ and $M''$ are also regular, then $M \cap \{\overline{x} \mid x \ge b\}$ is regular; and $M$ is therefore regular since it is the union of that language with a finite set. So to conclude the proof it suffices to prove that $M'$ and $M''$ are regular.

Let us start with $M''$, i.e. numbers modulo $2^p 5^q$. The integers whose decimal expansion is in $M''$ are characterized by their last $\mathrm{max}(p,q)$ digits, since changing digits further left means adding a multiple of $10^{\mathrm{max}(p,q)}$ which is a multiple of $2^p 5^q$. Hence $0^* M'' = \aleph^* F$ where $\aleph$ is the alphabet of all digits and $F$ is a finite set of words of length $\mathrm{max}(p,q)$, and $M'' = (\aleph^* F) \cap ((\aleph \setminus \{0\}) \aleph^*)$ is a regular language.

We now turn to $M'$, i.e. numbers modulo $a'$ where $a'$ is coprime with $10$. If $a' = 1$ then $M'$ is the set of decimal expansions of all naturals, i.e. $M' = \{0\} \cup ((\aleph \setminus \{0\}) \aleph^*)$, which is a regular language. We now assume $a' > 1$. Let $k = a'-1$. By Fermat's little theorem, $10^{a'-1} \equiv 1 \mod a'$, which is to say that $a'$ divides $10^k-1$. We build a deterministic finite automaton that will recognize $0^* M'$ as follows:

The states are $[0,k-1] \times [0,10^k-2]$. The first part represents a digit position and the second part represents a number modulo $10^k-1$.
The initial state is $(0,0)$.
There is a transition labeled $d$ from $(i,u)$ to $(j,v)$ iff $v \equiv d 10^i + u \mod 10^k-1$ and $j \equiv i + 1 \mod k$.
A state $(i,u)$ is final iff $u \equiv b \mod a'$ (note that $a'$ divides $10^k-1$).

The state $(i,u)$ reached from a word $\overline{x}$ satisfies $i \equiv |\overline{x}| \mod k$ and $u \equiv x \mod 10^k-1$. This can be proved by induction over the word, following the transitions on the automaton; the transitions are calculated for this, using the fact that $10^k \equiv 1 \mod 10^k-1$. Thus the automaton recognizes the decimal expansions (allowing initial zeroes) of the numbers of the form $u + y 10^k$ with $u \equiv b \mod a'$; since $10^k \equiv 1 \mod a'$, the automaton recognizes the decimal expansions of the numbers equal to $b$ modulo $a'$ allowing initial zeroes, which is $0^* M'$. This language is thus proved regular. Finally, $M' = (0^* M') \cap ((\aleph \setminus \{0\}) \aleph^*)$ is a regular language.

To generalize to bases other than $10$, replace $2$ and $5$ above by all the prime factors of the base.

Formal proof

Left as an exercise for the reader, in your favorite theorem prover.

OTHER TIPS

Very simple: Suppose numbers are written in decimal (other bases are handled by a trivial modification). Build a DFA, with $a$ states 0, 1, ..., $a - 1$. Start state is 0, and from state $q$ on input the digit $d$ go to state $(10 q + d) \bmod a$. Accepting state is $b \bmod a$ (might need a tweak if $b > a$).

It is regular. Let's first work in binary, which will generalize to any base > 1. Let $M_{a,b}$ be the language in question. For a = 1, b = 0 we get

$M_{1,0} = \{1, 10, 11, 100, 101, ...\}$

which is all strings over $\{0,1\}$ without leading zeroes, which is regular (construct a regular expression for it).

Now for any $a$, with b still 0 we get $M_{a,0}$ from $M_{1,0}$ by multiplying numerically by a, that is, performing the transformation $\bar x \rightarrow \overline {ax}$ on each string of $M_{a,0}$. That can be done bitwise by a sequence of shifts and additions of $x$ that depend on the bits of fixed string $\bar a$. The two transformations we need are:

$\bar x \rightarrow \overline {2x}$ which is $\bar x \rightarrow \bar x0$

and

$\bar x \rightarrow \overline {2x+x}$

Concatenating a 0 on the right clearly preserves regularity. We thus have only to prove that the second operation preserves regularity. The way to do that is with a finite-state transducer working on $\bar x$ from right to left. That's the hardest step. I suggest doing it with a pseudo-code program and some finite auxilliary memory (like some bit variables) rather than using states. As long as the memory is of a fixed size over all inputs and you scan strictly right to left, it's a finite state transduction and preserves regularity.

Finally, to get $M_{a,b}$ from $M_{a,0}$ we need to numerically add $b$ to each string, but that's done with a similar transducer $T_b$ which depends on the fixed number b.

To do the same in any base, show additionally how to perform multiplication by a single digit $d$ in that base, using a transducer $S_d$ that depends on d. With that, we can multiply by multi-digit numbers and still remain in the regular languages. Or, we can finesse this by saying that multiplication by $d$ is just repeated addition.

You wanted only hints. Each of those steps depends on a fairly complex theorem/technique, so proving those to get a complete proof will be the extra work.

Hint #1: first solve the more popular problem "write a automaton that recognizes the decimal/binary representations of numbers divisible by 3" when the least signigicant bit appears first.

Intermediate question: prove that $\{ \overline{a\,x+b}\mid a\,x+b≥0 \quad x\in\mathbb{Z}\}$ is regular.

Hint #2: The graph of the function $(n \mapsto 10n+d)$ "modulo $a$" is finite. You can compute it for each $d$ in $\{0, \dots, 9\}$ which is both the set of digits and the language of your automaton.

Hint #3: now, replace $x∈\mathbb Z$ with $x∈\mathbb N$. What does this change? Try to use the fact that regular languages are stable by intersection instead of building an ad-hoc automaton.

Hint #4: now assume that $a$ is a prime number (so that $\mathbb Z/a\mathbb Z$ is a field) and that we are still in the case where $x∈\mathbb Z$. We now use a representation where the first bit is the most significant bit. Can you build the automaton the same way?

Hint #5: you don't always have to build an automaton to prove a language is regular. How can you use previous results to prove that $M$ is regular? (with the most significant bit first)

I am following the idea of @vonbrand:

Hint 1:

Solve first for $b=0$. You might use the Myhill-Nerode Theorem.

We define the following relation $\bar x\approx \bar y :\iff x\equiv y \mod a$. This is obviously an equivalence relation. Furthermore it is right-congruent, since if we append a digit $d$ we get $\bar x \approx \bar y \iff 10x+d \equiv 10y+d \mod a \iff \bar x d \approx \bar y d$. Finally it saturates $L$, since $L$ is the equivalence class $[0]$. Since $\approx$ has a finite number of classes we have by the Myhill-Nerode Theorem that it is regular. The associated FSA is minimal and has $a$ states.

Hint 2:

Solve the general case, reuse the automaton induced by the $b=0$ case.

We know that the language is regular for $b=0$. Hence simply take the $a$ state FSA $M$ for the $b=0$ language. Now we construct a FSA for $L$. Assume that $b$ has $k$ digits. Then the FSA branches like a 10-ary tree of depth $k$ and contains all paths to numbers with $k$ digits. All states that can be reached with a number not in the form $ax+b$ are rejecting otherwise accepting. Finally we connect the tree-part of the FSA with the automaton $M$, according to the remainder by the division by $a$.

Licensed under: CC-BY-SA with attribution

Not affiliated with cs.stackexchange