Question

My solution: (for every bit of the input block, there is such a line)

*parity ^= (((x[0] >> 30) & 0x00000001) * 0xc3e0d69f);

All types are uint32. This line takes the second bit of the input x, shifts it to the LSB and sets all other bits to zero. Then, the 32bit parity is XORed with the corresponding parity set for this bit.

I found that this multiplication solution is the fastest way to do this conditional XOR. Is there a faster way?

Was it helpful?

Solution

I do not completely understand what kind of parity you mean, but if this line of code is doing that you want, it may be improved.

General rule: for x in {0, 1} x * N == -x & N

this because -x for 0 is all bits reset and for 1 is -1 in which all bits set.

So original line of code may be rewritten as:

*parity ^= (-((x[0] >> 30) & 0x00000001) & 0xc3e0d69f);

What two operations computed in less time than multiplication on many microprocessors, but you should check this.

Also code may take advantage of signed shift right

*parity ^= (((int32_t)x[0] << 1 >> 31) & 0xc3e0d69f);

First shift rshifts 30th bit into 31st, which is sign bit, and then second extend sign bit on all others as shift right on most machines act as floor(x / 2N), thus fill shifted in bits with sign bit (abc...yz>>3 == aaaabc...yz).

But these tricks are stated as undefined behaviour in C standard and thus not portable. Use them carefully.

OTHER TIPS

See here for some neat hacks for calculating parity of a word, byte etc

Some processors will do this for you. See x86's parity flag.

If I understand the question correctly, you are doing

for (i = 0; i < 32; i++)
    *parity ^= (((x[0] >> i) & 1) * SOME_CONST[i]); 

If so, it's better to use lookup tables:

for (i = 0; i < 4; i++)
    *parity ^= PARITY_LUT[i][ (x[0] >> (i*8)) & 0xFF];

It would cost 256kb, but will be much faster.

Parity calculation task is equivalent of counting of ones. Also it called as "count set bits", "population cont" or simply popcount. Some of processors have efficient instruction to calculate it (POPCNT,VCNT).

I will suggest to use lowest bit of popcount.

It can be accessed by inline assembler or by using builtins: __builtin_popcount()/ __popcnt()/ std::bitset::count() for GCC/VS/C++

Personally I prefer to give this job to compiler by using __builtin_parity()

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top