Linear feedback shift register?

https://stackoverflow.com/questions/3735217

03-10-2019
|

Question

Lately I bumped repeatedly into the concept of LFSR, that I find quite interesting because of its links with different fields and also fascinating in itself. It took me some effort to understand, the final help was this really good page, much better than the (at first) cryptic wikipedia entry. So I wanted to write some small code for a program that worked like a LFSR. To be more precise, that somehow showed how a LFSR works. Here's the cleanest thing I could come up with after some lenghtier attempts (Python):

def lfsr(seed, taps):
    sr, xor = seed, 0
    while 1:
        for t in taps:
            xor += int(sr[t-1])
        if xor%2 == 0.0:
            xor = 0
        else:
            xor = 1
        print xor
        sr, xor = str(xor) + sr[:-1], 0
        print sr
        if sr == seed:
            break

lfsr('11001001', (8,7,6,1))      #example

I named "xor" the output of the XOR function, not very correct. However, this is just meant to show how it circles through its possible states, in fact you noticed the register is represented by a string. Not much logical coherence.

This can be easily turned into a nice toy you can watch for hours (at least I could :-)

def lfsr(seed, taps):
    import time
    sr, xor = seed, 0
    while 1:
        for t in taps:
            xor += int(sr[t-1])
        if xor%2 == 0.0:
            xor = 0
        else:
            xor = 1
        print xor
        print
        time.sleep(0.75)
        sr, xor = str(xor) + sr[:-1], 0
        print sr
        print
        time.sleep(0.75)

Then it struck me, what use is this in writing software? I heard it can generate random numbers; is it true? how? So, it would be nice if someone could:

explain how to use such a device in software development
come up with some code, to support the point above or just like mine to show different ways to do it, in any language

Also, as theres not much didactic stuff around about this piece of logic and digital circuitry, it would be nice if this could be a place for noobies (like me) to get a better understanding of this thing, or better, to understand what it is and how it can be useful when writing software. Should have made it a community wiki?

That said, if someone feels like golfing... you're welcome.

Solution

Actually, algorithms based on LFSR are very common. CRC is actually directly based on LFSR. Of course, in computer science classes people talk about polynomials when they're talking about how the input value is supposed to be XORed with the accumulated value, in electornics engineering we talk about taps instead. They are the same just different terminology.

CRC32 is a very common one. It's used to detect errors in Ethernet frames. That means that when I posted this answer my PC used an LFSR based algorithm to generate a hash of the IP packet so that my router can verify that what it's transmitting isn't corrupted.

Zip and Gzip files are another example. Both use CRC for error detection. Zip uses CRC32 and Gzip uses both CRC16 and CRC32.

CRCs are basically hash functions. And it's good enough to make the internet work. Which means LFSRs are fairly good hash functions. I'm not sure if you know this but in general good hash functions are considered good random number generators. But the thing with LFSR is that selecting the correct taps (polynomials) is very important to the quality of the hash/random number.

Your code is generally toy code since it operates on a string of ones and zeros. In the real world LFSR work on bits in a byte. Each byte you push through the LFSR changes the accumulated value of the register. That value is effectively a checksum of all the bytes you've push through the register. Two common ways of using that value as a random number is to either use a counter and push a sequence of numbers through the register, thereby transforming the linear sequence 1,2,3,4 to some hashed sequence like 15306,22,5587,994, or to feed back the current value into the register to generate a new number in seemingly random sequence.

It should be noted that doing this naively using bit-fiddling LFSR is quite slow since you have to process bits at a time. So people have come up with ways using pre-calculated tables to do it eight bits at a time or even 32 bits at a time. This is why you almost never see LFSR code in the wild. In most production code it masquerades as something else.

But sometimes a plain bit-twiddling LFSR can come in handy. I once wrote a Modbus driver for a PIC micro and that protocol used CRC16. A pre-calculated table requires 256 bytes of memory and my CPU only had 68 bytes (I'm not kidding). So I had to use an LFSR.

OTHER TIPS

Since I was looking for a LFSR-implementation in Python, I stumbled upon this topic. I found however that the following was a bit more accurate according to my needs:

def lfsr(seed, mask):
    result = seed
    nbits = mask.bit_length()-1
    while True:
        result = (result << 1)
        xor = result >> nbits
        if xor != 0:
            result ^= mask

        yield xor, result

The above LFSR-generator is based on GF(2^k) modulus calculus (GF = Galois Field). Having just completed an Algebra course, I'm going to explain this the mathematical way.

Let's start by taking, for example, GF(2⁴), which equals to {a₄x⁴ + a₃x³ + a₂x² + a₁x¹ + a₀x⁰ | a₀, a₁, ..., a₄ ∈ Z₂} (to clarify, Z_n = {0,1,...,n-1} and therefore Z₂ = {0,1}, i.e. one bit). This means that this is the set of all polynomials of the fourth degree with all factors either being present or not, but having no multiples of these factors (e.g. there's no 2x^k). x³, x⁴ + x³, 1 and x⁴ + x³ + x² + x + 1 are all examples of members of this group.

We take this set modulus a polynomial of the fourth degree (i.e., P(x) ∈ GF(2⁴)), e.g. P(x) = x⁴+x¹+x⁰. This modulus operation on a group is also denoted as GF(2⁴) / P(x). For your reference, P(x) describes the 'taps' within the LFSR.

We also take a random polynomial of degree 3 or lower (so that it's not affected by our modulus, otherwise we could just as well perform the modulus operation directly on it), e.g. A₀(x) = x⁰. Now every subsequent A_i(x) is calculated by multiplying it with x: A_i(x) = A_i-1(x) * x mod P(x).

Since we are in a limited field, the modulus operation may have an effect, but only when the resulting A_i(x) has at least a factor x⁴ (our highest factor in P(x)). Note that, since we are working with numbers in Z₂, performing the modulus operation itself is nothing more than determining whether every a_i becomes a 0 or 1 by adding the two values from P(x) and A_i(x) together (i.e., 0+0=0, 0+1=1, 1+1=0, or 'xoring' these two).

Every polynomial can be written as a set of bits, for example x⁴+x¹+x⁰ ~ 10011. The A₀(x) can be seen as the seed. The 'times x' operation can be seen as a shift left operation. The modulus operation can be seen as a bit masking operation, with the mask being our P(x).

The algorithm depicted above therefore generates (an infinite stream of) valid four bit LFSR patterns. For example, for our defined A₀(x) (x⁰) and P(x) (x⁴+x¹+x⁰), we can define the following first yielded results in GF(2⁴) (note that A₀ is not yielded until at the end of the first round -- mathematicians generally start counting at '1'):

 i   Ai(x)                   'x⁴'  bit pattern
 0   0x³ + 0x² + 0x¹ + 1x⁰   0     0001        (not yielded)
 1   0x³ + 0x² + 1x¹ + 0x⁰   0     0010
 2   0x³ + 1x² + 0x¹ + 0x⁰   0     0100
 3   1x³ + 0x² + 0x¹ + 0x⁰   0     1000
 4   0x³ + 0x² + 1x¹ + 1x⁰   1     0011        (first time we 'overflow')
 5   0x³ + 1x² + 1x¹ + 0x⁰   0     0110
 6   1x³ + 1x² + 0x¹ + 0x⁰   0     1100
 7   1x³ + 0x² + 1x¹ + 1x⁰   1     1011
 8   0x³ + 1x² + 0x¹ + 1x⁰   1     0101
 9   1x³ + 0x² + 1x¹ + 0x⁰   0     1010
10   0x³ + 1x² + 1x¹ + 1x⁰   1     0111
11   1x³ + 1x² + 1x¹ + 0x⁰   0     1110
12   1x³ + 1x² + 1x¹ + 1x⁰   1     1111
13   1x³ + 1x² + 0x¹ + 1x⁰   1     1101
14   1x³ + 0x² + 0x¹ + 1x⁰   1     1001
15   0x³ + 0x² + 0x¹ + 1x⁰   1     0001        (same as i=0)

Note that your mask must contain a '1' at the fourth position to make sure that your LFSR generates four-bit results. Also note that a '1' must be present at the zeroth position to make sure that your bitstream would not end up with a 0000 bit pattern, or that the final bit would become unused (if all bits are shifted to the left, you would also end up with a zero at the 0th position after one shift).

Not all P(x)'s necessarily are generators for GF(2^k) (i.e., not all masks of k bits generate all 2^k-1-1 numbers). For example, x⁴ + x³ + x² + x¹ + x⁰ generates 3 groups of 5 distinct polynomals each, or "3 cycles of period 5": 0001,0010,0100,1000,1111; 0011,0110,1100,0111,1110; and 0101,1010,1011,1001,1101. Note that 0000 can never be generated, and can't generate any other number.

Usually, the output of an LFSR is the bit that is 'shifted' out, which is a '1' if the modulus operation is performed, and a '0' when it isn't. LFSR's with a period of 2^k-1-1, also called pseudo-noise or PN-LFSR's, adhere to Golomb's randomness postulates, which says as much as that this output bit is random 'enough'.

Sequences of these bits therefore have their use in cryptography, for instance in the A5/1 and A5/2 mobile encryption standards, or the E0 Bluetooth standard. However, they are not as secure as one would like: the Berlekamp-Massey algorithm can be used to reverse-engineer the characteristic polynomal (the P(x)) of the LFSR. Strong encryption standards therefore use Non-linear FSR's or similar non-linear functions. A related topic to this are the S-Boxes used in AES.

Note that I have used the int.bit_length() operation. This was not implemented until Python 2.7.
If you'd only like a finite bit pattern, you could check whether the seed equals the result and then break your loop.
You can use my LFSR-method in a for-loop (e.g. for xor, pattern in lfsr(0b001,0b10011)) or you can repeatedly call the .next() operation on the result of the method, returning a new (xor, result)-pair everytime.

There are many applications of LFSRs. One of them is generating noise, for instance the SN76489 and variants (used on the Master System, Game Gear, MegaDrive, NeoGeo Pocket, ...) use a LFSR to generate white/periodic noise. There's a really good description of SN76489's LFSR in this page.

To make it really elegant and Pythonic, try to create a generator, yield-ing successive values from the LFSR. Also, comparing to a floating point 0.0 is unnecessary and confusing.

A LFSR is just one of many ways to create pseudo-random numbers in computers. Pseudo-random, because there numbers aren't really random - you can easily repeat them by starting with the seed (initial value) and proceeding with the same mathematical operations.

Below is a variation on your code using integers and binary operators instead of strings. It also uses yield as someone suggested.

def lfsr2(seed, taps):
    sr = seed
    nbits = 8
    while 1:
        xor = 1
        for t in taps:
            if (sr & (1<<(t-1))) != 0:
                xor ^= 1
        sr = (xor << nbits-1) + (sr >> 1)
        yield xor, sr
        if sr == seed:
            break

nbits = 8
for xor, sr in lfsr2(0b11001001, (8,7,6,1)):
    print xor, bin(2**nbits+sr)[3:]

If we assume that seed is a list of ints rather than a string (or convert it if it is not) then the following should do what you want with a bit more elegance:

def lfsr(seed, taps) :
  while True:
    nxt = sum([ seed[x] for x in taps]) % 2
    yield nxt
    seed = ([nxt] + seed)[:max(taps)+1]

Example :

for x in lfsr([1,0,1,1,1,0,1,0,0],[1,5,6]) :
  print x

list_init=[1,0,1,1]
list_coeff=[1,1,0,0]
out=[]
for i in range(15):
    list_init.append(sum([list_init[i]*list_coeff[i] for i in range(len(list_init))])%2)
    out.append(list_init.pop(0))
print(out)

#https://www.rocq.inria.fr/secret/Anne.Canteaut/encyclopedia.pdf

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow