Question

I don't understand how CPU specifications affect performance. I'm running an application that performs modular exponentiation (a DH key exchange) on Windows with the following parameters:

Modulus: a 4096-bit prime

Generator: 2

Exponent: 256 bits

When the application runs on 32-bit Windows 7 with a 2.4 GHz processor and 4 GB of RAM, it takes 3-4 seconds. However, when I run the same application on 64-bit Windows 7 with the same processor speed and 8 GB of RAM, it takes 1-2 seconds.

I'm trying to understand whether the speed of the modular calculation is affected by the RAM size or by the CPU mode (64-bit vs. 32-bit).


Solution

64-bit CPUs are significantly faster at big-integer arithmetic than 32-bit CPUs. In my experience the speedup is about a factor of 2 with identical code and a factor of 4 with specialized code.

  • In code written with x86 in mind, many intermediate values are 64 bits wide. For example, multiplying two 32-bit integers yields a 64-bit result, which then needs to be added, shifted, and finally split back into 32-bit integers.

    AMD64 (64-bit) CPUs have larger registers, and more of them, than x86 (32-bit) CPUs. These intermediate values therefore fit into a single register, and the compiler doesn't need to stitch together two 32-bit registers to give the appearance of 64-bit integers in C. The additional registers also mean the code needs to go through the stack less often.

    This makes such code about twice as fast as on the same CPU in 32-bit mode.

  • Another important difference is that AMD64 (64-bit) supports a 64x64->128-bit multiplication, while x86 (32-bit) only supports a 32x32->64-bit multiplication. The wider multiplication is about twice as expensive but does 4x as much work.

    This gives another factor of 2 if you write code that uses 128-bit integers to hold intermediate values.
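The "4x as much" claim can be checked with a small model. The sketch below is only an illustration, not how any particular library works: it splits two 4096-bit numbers into 32-bit or 64-bit limbs, multiplies them with the plain schoolbook method, and counts the widening limb multiplications. All function names (`to_limbs`, `from_limbs`, `schoolbook_mul`, `limb_mults`) are made up for this example, and Python integers stand in for the double-width register pair. Production bignum code typically uses more refined algorithms (for example Karatsuba multiplication and Montgomery reduction), so the counts model the limb-width effect only, not real timings.

```python
import random


def to_limbs(n, limb_bits, count):
    """Split a non-negative integer into `count` little-endian limbs."""
    mask = (1 << limb_bits) - 1
    return [(n >> (i * limb_bits)) & mask for i in range(count)]


def from_limbs(limbs, limb_bits):
    """Reassemble an integer from little-endian limbs."""
    return sum(limb << (i * limb_bits) for i, limb in enumerate(limbs))


def schoolbook_mul(a, b, limb_bits):
    """Multiply two limb arrays.

    Returns (product limbs, number of limb-by-limb multiplications).
    """
    mask = (1 << limb_bits) - 1
    out = [0] * (len(a) + len(b))
    mults = 0
    for i, ai in enumerate(a):
        carry = 0
        for j, bj in enumerate(b):
            # One widening multiply (limb x limb -> double width),
            # plus the accumulated limb and the incoming carry.
            # The sum always fits in 2 * limb_bits bits.
            t = ai * bj + out[i + j] + carry
            mults += 1
            out[i + j] = t & mask      # low half stays in this position
            carry = t >> limb_bits     # high half moves to the next limb
        out[i + len(b)] = carry
    return out, mults


def limb_mults(bits, limb_bits, seed=0):
    """Count limb multiplications for one bits x bits product."""
    rng = random.Random(seed)
    x, y = rng.getrandbits(bits), rng.getrandbits(bits)
    n = bits // limb_bits
    limbs, mults = schoolbook_mul(
        to_limbs(x, limb_bits, n), to_limbs(y, limb_bits, n), limb_bits
    )
    # Sanity check against Python's built-in big integers.
    assert from_limbs(limbs, limb_bits) == x * y
    return mults


m32 = limb_mults(4096, 32)   # 128 limbs per operand
m64 = limb_mults(4096, 64)   # 64 limbs per operand
print(m32, m64, m32 / m64)   # prints: 16384 4096 4.0
```

With 64-bit limbs each operand has half as many limbs, so one 4096-bit product needs a quarter of the limb multiplications. If each 64x64->128 multiply costs about twice as much as a 32x32->64 one, as stated above, the net gain for the multiplication step is roughly a factor of 2.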

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow