Question

The Intel® 64 and IA-32 Architectures Optimization Reference Manual lists latency and throughput figures for various CPU instructions.

For transcendental functions (FSIN etc.), some of the figures are listed as ranges (page C-29). Footnote 4 explains:

Latency and Throughput of transcendental instructions can vary substantially in a dynamic execution environment. Only an approximate value or a range of values are given for these instructions.

My question is: what factors affect the throughput and latency of such instructions? I imagine the value of the argument is one factor. Are there any others?

Solution

Besides the argument, the mix of other instructions in flight can affect latency and throughput. These instructions are microcoded, which means they expand into a sequence of µops that must contend with other instructions for ALU resources. When such contention occurs, performance may be adversely affected.

Other answers

The x87 control word specifies the precision of computations (64-bit, 53-bit, or 24-bit mantissa), and it can affect the performance of transcendental functions, especially those that internally use division or square root. In general, I advise against using the x87 trigonometric instructions, because by design they are very inaccurate for large input values.
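
To make the precision setting concrete, here is a minimal Python sketch that decodes and rewrites the precision-control field (bits 8-9) of an x87 control word value. The helper names precision_bits and with_precision are hypothetical, chosen for illustration. The code only manipulates the 16-bit value; actually loading it into the FPU would need FLDCW or a compiler-specific wrapper, which is not shown here.

```python
# Precision-control (PC) field: bits 8-9 of the 16-bit x87 control word.
PC_SHIFT = 8
PC_MASK = 0b11 << PC_SHIFT

# PC encoding -> mantissa width in bits; encoding 0b01 is reserved.
PC_TO_BITS = {0b00: 24, 0b10: 53, 0b11: 64}
BITS_TO_PC = {bits: pc for pc, bits in PC_TO_BITS.items()}


def precision_bits(control_word):
    """Return the mantissa width selected by control_word, or None if reserved.

    Hypothetical helper, for illustration only.
    """
    pc = (control_word & PC_MASK) >> PC_SHIFT
    return PC_TO_BITS.get(pc)


def with_precision(control_word, bits):
    """Return a copy of control_word with the PC field set to the given width.

    Hypothetical helper, for illustration only.
    """
    if bits not in BITS_TO_PC:
        raise ValueError("precision must be 24, 53, or 64 bits")
    cleared = control_word & ~PC_MASK & 0xFFFF
    return cleared | (BITS_TO_PC[bits] << PC_SHIFT)
```

For example, the control word left by FNINIT is 0x037F, which selects the 64-bit mantissa; with_precision(0x037F, 53) yields a value that selects the 53-bit setting instead.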

Licensed under: CC-BY-SA with attribution