Question

Is there some instruction in MIPS that will determine the parity of a certain bit representation? I know to determine whether a "number" has an even parity or an odd parity is to XOR the individual bits of the binary representation together, but that seems computationally-intensive for a set of MIPS instructions... and I need to do this as quick as possible.

Also, the number I'm working in is represented in Grey Code... just to throw that in there. So is there some pseudo-instruction in MIPS to determine the parity of a "number" or do I have to do it by hand?

If there is no MIPS instruction, which it seems very unlikely to be, any advice on how to do it by hand?

Thanks, Hristo

follow-up: I found a optimization, but my implementation isn't working.

unsigned int v; // 32-bit word
v ^= v >> 1;
v ^= v >> 2;
v = (v & 0x11111111U) * 0x11111111U;
return (v >> 28) & 1;
Was it helpful?

Solution

I'm not aware of any MIPS variant with a parity instruction, but there's a bit-twiddling trick for calculating parity faster than the obvious method of running through each of the 32 bits in turn. In C:

result = in ^ (in >> 16);
result ^= (result >> 8);
result ^= (result >> 4);
result ^= (result >> 2);
result ^= (result >> 1);
result &= 1;
  • After the first step, the bottom 16 bits of the result contain the parity of bits N and N+16 of the input - essentially, 16 steps of the parity calculation have been performed in one go. Writing result{N} to mean "bit N of result":

    result{0}  =  in{0} ^ in{16}
    result{1}  =  in{1} ^ in{17}
    result{2}  =  in{2} ^ in{18}
    ...
    result{7}  =  in{7} ^ in{23}
    result{8}  =  in{8} ^ in{24}
    ...
    result{15} = in{15} ^ in{31}
    

    (and the remaining top 16 bits of result can now be ignored; they serve no useful purpose in the remainder of the calculation).

  • After the second step, the bottom 8 bits of result contain the parity of bits N, N+8, N+16, N+24 of the original input:

    result{0} = result{0} ^ result{8}  =  in{0} ^  in{8} ^ in{16} ^ in{24}
    result{1} = result{1} ^ result{9}  =  in{1} ^  in{9} ^ in{17} ^ in{25}
    ...
    result{7} = result{7} ^ result{15} =  in{7} ^ in{15} ^ in{23} ^ in{31}
    

    (and again, the remaining bits can be ignored from here onwards).

  • ...and so on, until the parity of all the bits of the original input ends up in the bottom bit of result:

    result{0} = in{0} ^ in{1} ^ in{2} ^ ... ^ in{30} ^ in{31}
    

This is easy to translate directly to MIPS assembly; it's 11 instructions:

# input in $a0, output in $v0, $t0 corrupted
srl $t0, $a0, 16
xor $v0, $a0, $t0
srl $t0, $v0, 8
xor $v0, $v0, $t0
srl $t0, $v0, 4
xor $v0, $v0, $t0
srl $t0, $v0, 2
xor $v0, $v0, $t0
srl $t0, $v0, 1
xor $v0, $v0, $t0
and $v0, $v0, 1

A possible improvement might be to use a lookup table. For example, after the first two steps, we have:

    result{0} =  in{0} ^  in{8} ^ in{16} ^ in{24}
    result{1} =  in{1} ^  in{9} ^ in{17} ^ in{25}
    ...
    result{7} =  in{7} ^ in{15} ^ in{23} ^ in{31}

so we could use a 256-byte lookup table at this point. In C:

result = in ^ (in >> 16);
result ^= (result >> 8);
result = lookup_table[result & 0xff];

where lookup_table[n] has been precalculated, e.g.:

for (i = 0; i < 256; i++) {
    n = i ^ (i >> 4);
    n ^= (n >> 2);
    n ^= (n >> 1);
    lookup_table[i] = n & 1;
}

This is 7 MIPS instructions, not counting loading the lookup table base address into a register:

# input in $a0, lookup table address in $a1, output in $v0, $t0 corrupted
srl  $t0, $a0, 16
xor  $v0, $a0, $t0
srl  $t0, $v0, 8
xor  $v0, $v0, $t0
andi $v0, $v0, 0xff
addu $t0, $a1, $v0
lbu  $v0, 0($t0)

However, this is 7 instructions which include a memory access, versus 11 instructions which are purely register operations; it may or may not be faster. (This sort of micro-optimisation always needs to be profiled!)

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top