Question

I've been looking into neon optimisation with intrinsics recently and I have come across the poly8_t and poly16_t data types. I'm then left wondering what on earth they are.

I've searched all across the net but so far have been unable to find ANY explanation of what they are.

Can anyone explain them to me?

Edit: Thanks for those answers but why, if it is just a different way of multiplying etc, does it have a totally different data type?

Was it helpful?

Solution

Left = regular multiplication, Right = carryless multiplication

        1 1 0 1                              1 1 0 1
     *  1 0 0 1                              1 0 0 1
   ------------        -->              --------------
     (1)1 1 0 1  <-- (1) is carry            1 1 0 1
      0 0 0 0                              0 0 0 0 
    0 0 0 0                              0 0 0 0
  1 1 0 1        +                     1 1 0 1         + GF(2) or XOR
  -------------                        ---------------
  1 1 1 0 1 0 1                        1 1 0 0 1 0 1

Each 1 or 0 in the diagonally descending matrix represents partial product of one source bit from the vector '1101' and one source bit from the other vector '1001'.

The applications of the right one are in CRC, (ECC) Error Correction Code calculations (Reed Solomon, BCH) and cryptography (elliptic curves, internals of AES).

Illustrating the connection to polynomial multiplication, the operation above can summarized as

 1101 == x^3 + x^2 + 0 + 1;
 1001 == x^3 + 0   + 0 + 1;

Regular polynomial multiplication being: p(x) * (x^3 + 1) == p(x)*x^3 + p(x) ==

 (x^3 + x^2 + 1)(x^3 + 1) == x^6+x^5+x^3 + x^3+x^2+1 
                          == 1x^6 + 1x^5 + 0x^4 + 2x^3 + 1^x2 + 0x + 1
                          == "1102101"

In GF(2) each coefficient is simply calculated modulo 2, making 1100101b.

The datatype in GF looks just like uint8_t, uint16_t or perhaps upto 128_t in respect that the datatype for GF(2^8) holds 256 unique bitpatterns. However e.g. the bitpattern '00010001' e.g. has no traditional interpretation. (It's not 17 decimal, but perhaps 123th power of "unity" modulo some other polynomial.) Multiplying this number with the same "unity" modulo the generator polynomial g(x) leads to 124th power and so on. Then the properties (identities) of the finite fields have just interesting applications -- such that one can (remotely) easily calculate what 32-bit number to append to a file to make it's 32-bit CRC to match; or one can use the properties to parallelize crc calculation, or to implement bignum multiplication with Fourier-like transform in Finite fields (Number Theoretic Transform).

OTHER TIPS

These types are used for carry-less multiplication. It is useful for cryptographic algorithms and CRC hash sums. Here are some white papers about applications (they explore x86 PCLMULQDQ instruction, but the same ideas apply to carry-less multiplication on ARM processors):

You did not describe PMUL vs PMULL.

As I understand it (probably incorrectly) each per element PMUL works on two 8 bit source elements and generates one 8-bit result element.

Each per element PMUL generates 8 partial products and each PP are shifted respectively before XORed. So from the lsb of the first PP to the msb of the last PP. There seem to be 15-bits of result. PMUL can only store an 8-bit result.

Are the most significant 7-bits of 15-bit result discarded?

As a reference, this is from Cortex-A Series Programmer’s Guide (v4) chapter 7.2.2:

Polynomial arithmetic is useful when implementing certain cryptography or data integrity algorithms.

Adding two polynomials over {0,1} is the same as a bitwise exclusive OR. Polynomial additional results in different values to a conventional addition.

Multiplying two polynomials over {0,1} involves first determining the partial products as done in conventional multiply, then the partial products are exclusive ORed instead of being added conventionally. Polynomial multiplication results in different values to conventional multiplication because it requires polynomial addition of the partial products.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top