AVX: three operands for sqrt?

https://stackoverflow.com/questions/10735652

10-06-2021

Question

Why does the AVX scalar (non-packed) sqrt instruction have three operands?

vsqrtsd xmm1, xmm2, xmm3

Does this mean something like xmm1=xmm2=sqrt(xmm3)?

Edit: A detailed answer is below, but in short the instruction means:

xmm1.low  = sqrt(xmm3.low);
xmm1.high = xmm2.high;

Solution

Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 2B, Page 4-407, "SQRTSD—Compute Square Root of Scalar Double-Precision Floating-Point Value":

VSQRTSD xmm1, xmm2, xmm3/m64

Computes square root of the low double-precision floating point value in xmm3/m64 and stores the results in xmm1*. Also, upper double precision floating-point value (bits[127:64]) from xmm2 is copied to xmm1[127:64].

Operation
DEST[63:0] ← SQRT(SRC2[63:0])
DEST[127:64] ← SRC1[127:64]
DEST[VLMAX-1:128] ← 0
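
The three assignments above can be modeled in a few lines of Python. This is only an illustrative sketch: the function name `vsqrtsd` and the representation of a 256-bit register as a list of four doubles (lowest lane first) are assumptions made for this example, not part of the Intel manual.

```python
import math

def vsqrtsd(src1, src2):
    """Model of VSQRTSD dest, src1, src2.

    Registers are lists of four doubles, lowest lane first
    (a hypothetical stand-in for a 256-bit YMM register).
    """
    dest = [0.0] * 4               # DEST[VLMAX-1:128] <- 0
    dest[0] = math.sqrt(src2[0])   # DEST[63:0]        <- SQRT(SRC2[63:0])
    dest[1] = src1[1]              # DEST[127:64]      <- SRC1[127:64]
    return dest

xmm2 = [1.0, 2.0, 3.0, 4.0]
xmm3 = [9.0, 5.0, 6.0, 7.0]
print(vsqrtsd(xmm2, xmm3))  # [3.0, 2.0, 0.0, 0.0]
```

Only the low lane of the last operand and the second lane of the middle operand influence the result. Note that `math.sqrt` raises an error for negative input, whereas the instruction returns NaN; the model does not try to reproduce that.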

The instruction is simply following the pattern of other binary V___SD and V___SS operations like VSUBSD, which performs

DEST[63:0] ← SRC1[63:0] - SRC2[63:0]
DEST[127:64] ← SRC1[127:64]
DEST[VLMAX-1:128] ← 0

and like VRCPSS xmm1, xmm2, xmm3/m32, which performs

DEST[31:0] ← APPROXIMATE(1/SRC2[31:0])
DEST[127:32] ← SRC1[127:32]
DEST[VLMAX-1:128] ← 0

The general form is

xmm1.low = f(xmm2.low, xmm3.low);
xmm1.high = xmm2.high,

as described in Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1, §11.4.1 "Packed and Scalar Double-Precision Floating-Point Instructions". For VSQRTSD we just define f(x, y) = √y, ignoring the first operand.
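
To make the shared pattern concrete, here is a small Python sketch of the general form. The helper `scalar_op` and the `(low, high)` tuple representation of a 128-bit register are hypothetical names chosen for this illustration.

```python
import math

def scalar_op(f, src1, src2):
    """General scalar form: apply f to the low lanes, copy the high lane from src1.

    Registers are (low, high) tuples of doubles (an assumed representation).
    """
    return (f(src1[0], src2[0]), src1[1])

def vsubsd(src1, src2):
    # A true binary operation: both low lanes are used.
    return scalar_op(lambda x, y: x - y, src1, src2)

def vsqrtsd(src1, src2):
    # Unary operation squeezed into the binary form: f ignores x.
    return scalar_op(lambda x, y: math.sqrt(y), src1, src2)

a = (10.0, 20.0)
b = (4.0, 40.0)
print(vsubsd(a, b))   # (6.0, 20.0)
print(vsqrtsd(a, b))  # (2.0, 20.0)
```

In both cases the high lane of the result comes from the first source, which is why the unary square root still takes two source operands.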

*: Note: The Intel Manual writes "xmm2" here, which is an error.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow