Question

I want to implement the equivalent of C's uint-to-double cast in the GHC Haskell compiler. We already implement int-to-double using FILD or CVTSI2SD. Are there unsigned versions of these operations, or am I supposed to zero out the highest bit of the uint before the conversion (thus losing range)?


Solution 2

You can exploit some properties of the IEEE double format: interpret the unsigned value as part of the mantissa while adding a carefully crafted exponent.

Bits  63   62-52   51-0
      S    Exp     Mantissa
      0    1075    20 zero bits, followed by your 32-bit unsigned int

The 1075 is the IEEE double exponent bias (1023) plus a "shift" of 52 bits that places your value in the mantissa. Note that there is an implicit leading "1" on the mantissa, which contributes 2^52 and needs to be subtracted afterwards.

So:

#include <stdint.h>
#include <string.h>

double uint32_to_double(uint32_t x) {
    uint64_t xx = x;
    xx += 1075ULL << 52;              // place the biased exponent field
    double d;
    memcpy(&d, &xx, sizeof d);        // bit-cast; avoids strict-aliasing UB
    return d - (double)(1ULL << 52);  // subtract the implicit leading 1 (2^52)
}

If you don't have native 64-bit integers on your platform, a version using SSE for the integer steps might be beneficial, but that of course depends on the target.

On my platform this compiles to

0000000000000000 <uint32_to_double>:
   0:   48 b8 00 00 00 00 00    movabs $0x4330000000000000,%rax
   7:   00 30 43 
   a:   89 ff                   mov    %edi,%edi
   c:   48 01 f8                add    %rdi,%rax
   f:   c4 e1 f9 6e c0          vmovq  %rax,%xmm0
  14:   c5 fb 5c 05 00 00 00    vsubsd 0x0(%rip),%xmm0,%xmm0 
  1b:   00 
  1c:   c3                      retq

which looks pretty good. The 0x0(%rip) operand is the magic double constant (2^52), and after inlining some instructions, such as the upper-32-bit zeroing and the constant reload, will vanish.

OTHER TIPS

As someone said, "Good Artists Copy; Great Artists Steal". So we can just check how other compiler writers solved this issue. I used a simple snippet:

volatile unsigned int x;
int main()
{
  volatile double  y = x;
  return y;
}

(volatiles added to ensure the compiler does not optimize out the conversions)

Results (irrelevant instructions skipped):

Visual C++ 2010 cl /Ox (x86)

  __real@41f0000000000000 DQ 041f0000000000000r ; 4.29497e+009

  mov   eax, DWORD PTR ?x@@3IC          ; x
  fild  DWORD PTR ?x@@3IC           ; x
  test  eax, eax
  jns   SHORT $LN4@main
  fadd  QWORD PTR __real@41f0000000000000
$LN4@main:
  fstp  QWORD PTR _y$[esp+8]

So the compiler performs a signed conversion and then adds an adjustment value of 2^32 if the sign bit was set.
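In C, that strategy corresponds to roughly the following sketch (the helper name is mine, not the compiler's):

#include <stdint.h>

double u32_to_double_adjust(uint32_t x) {
    double d = (double)(int32_t)x;   // signed conversion, like FILD
    if ((int32_t)x < 0)
        d += 4294967296.0;           // re-add 2^32 if the sign bit was set
    return d;
}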

Visual C++ 2010 cl /Ox (x64)

  mov   eax, DWORD PTR ?x@@3IC          ; x
  pxor  xmm0, xmm0
  cvtsi2sd xmm0, rax
  movsdx    QWORD PTR y$[rsp], xmm0

No adjustment is needed here because mov eax zero-extends into rax, so the compiler knows the 64-bit value is non-negative.
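In C, this is just a widening cast (sketch; the helper name is mine):

#include <stdint.h>

double u32_to_double_x64(uint32_t x) {
    // Every uint32_t is exactly representable as a non-negative int64_t,
    // and the zero-extension is free on x86-64.
    return (double)(int64_t)x;
}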

Visual C++ 2012 cl /Ox

  __xmm@41f00000000000000000000000000000 DB 00H, 00H, 00H, 00H, 00H, 00H, 00H
  DB 00H, 00H, 00H, 00H, 00H, 00H, 00H, 0f0H, 'A'

  mov   eax, DWORD PTR ?x@@3IC          ; x
  movd  xmm0, eax
  cvtdq2pd xmm0, xmm0
  shr   eax, 31                 ; 0000001fH
  addsd xmm0, QWORD PTR __xmm@41f00000000000000000000000000000[eax*8]
  movsd QWORD PTR _y$[esp+8], xmm0

This uses branchless code to add 0 or the magic adjustment depending on whether the sign bit was cleared or set.
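In C, the branchless version looks roughly like this (sketch; names are mine):

#include <stdint.h>

double u32_to_double_branchless(uint32_t x) {
    static const double adjust[2] = { 0.0, 4294967296.0 };  // 0 or 2^32
    return (double)(int32_t)x + adjust[x >> 31];             // index by the sign bit
}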

There is a better way

__m128d _mm_cvtsu32_sd(__m128i n) {
    const __m128i magic_mask = _mm_set_epi32(0, 0, 0x43300000, 0); // high word of 2^52
    const __m128d magic_bias = _mm_set_sd(4503599627370496.0);     // 2^52
    // OR the u32 into the mantissa of 2^52, then subtract 2^52 back out.
    return _mm_sub_sd(_mm_castsi128_pd(_mm_or_si128(n, magic_mask)), magic_bias);
}
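To use it from scalar code you might wrap it like this (the wrapper is my addition; _mm_cvtsi32_si128 and _mm_cvtsd_f64 are plain SSE2 intrinsics):

#include <stdint.h>
#include <emmintrin.h>

double u32_to_double_sse2(uint32_t x) {
    __m128i n = _mm_cvtsi32_si128((int)x);    // movd: u32 into the low lane
    return _mm_cvtsd_f64(_mm_cvtsu32_sd(n));  // extract the low double
}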

We already implement int-to-double using FILD ...
Are there unsigned versions of these operations

If you specifically want to use the x87 FILD opcode, shift the uint64 right by one (dividing by 2 so it fits in 63 bits), convert, and then multiply it by 2 again, but now as a double. The x87 uint64-to-double conversion thus costs one extra FMUL.

For example: 0xFFFFFFFFFFFFFFFFU -> +1.8446744073709551e+19
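A portable C rendering of the same idea (the name is mine; note the shift discards the LSB, rounding odd inputs toward zero, which the commented-out lines in the asm below were meant to fix):

#include <stdint.h>

double u64_to_double_halved(uint64_t v) {
    // v >> 1 fits in 63 bits, so the signed conversion (FILD/cvtsi2sd)
    // sees a non-negative value; the final multiply by 2.0 is exact.
    return (double)(int64_t)(v >> 1) * 2.0;
}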


double u64_to_d(unsigned __int64 v) {
    volatile unsigned int tmp = 2;   // multiplier to undo the halving
    _asm {
        fild  dword ptr tmp          ; st(0) = 2.0
        shr   dword ptr v+4, 1       ; 64-bit right shift by 1:
        rcr   dword ptr v, 1         ;   v >>= 1, LSB falls into CF
        fild  qword ptr v            ; signed load; v/2 now fits in 63 bits
        fmulp st(1), st              ; st(0) = (double)(v/2) * 2.0
        ; (the commented-out lines in the original saved the shifted-out
        ;  LSB and re-added it via fild/faddp for exact odd inputs)
    }
    // result is left in st(0), the x86 return register for double
}

VC produced this x86 output (the listing interleaves the original source, including its commented-out lines):

        //inline
        double    u64_to_d(unsigned __int64 v){
    55                   push        ebp  
    8B EC                mov         ebp,esp  
    81 EC 04 00 00 00    sub         esp,04h  

        //volatile double   res;
        volatile unsigned int tmp=2;
    C7 45 FC 02 00 00 00 mov         dword ptr [tmp], 2  
        _asm{
        fild  dword ptr tmp
    DB 45 FC             fild        dword ptr [tmp]  
        //v>>=1;
        shr   dword ptr v+4, 1
    D1 6D 0C             shr         dword ptr [ebp+0Ch],1  
        rcr   dword ptr v, 1
    D1 5D 08             rcr         dword ptr [v],1  
        fild  qword ptr v
    DF 6D 08             fild        qword ptr [v]  

        //save lsb
    //    mov   byte ptr [tmp], 0  
    //C6 45 FC 00        mov         byte ptr [tmp], 0
    //    rcl   byte ptr tmp, 1
    //D0 55 FC           rcl         byte ptr [tmp],1  

        //res=tmp+res*2;
        fmulp st(1),st
    DE C9                fmulp       st(1),st  
    //    fild  dword ptr tmp
    //DB 45 FC           fild        dword ptr [tmp]  
    //    faddp st(1),st 
    //DE C1              faddp       st(1),st  


        //fstp  qword ptr res
        //fstp        qword ptr [res]  
    }

        //return res;
        //fld         qword ptr [res]  

    8B E5                mov         esp,ebp  
    5D                   pop         ebp  
    C3                   ret  
}


If I'm understanding you correctly, you should be able to move your 32-bit uint to a temp area on the stack, zero out the next dword, and then use fild qword ptr to load the now-zero-extended 64-bit integer as a double. FILD is a signed load, but that's fine here because the zeroed high dword makes the value non-negative; see the sketch below.
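A minimal C sketch of that idea (the helper name u32_to_double_fild is mine; on 32-bit x87 targets the compiler typically spills the zero-extended value to the stack and emits a single fild qword ptr load):

#include <stdint.h>

double u32_to_double_fild(uint32_t x) {
    int64_t wide = (int64_t)x;   // zero-extend: the high dword is 0
    return (double)wide;         // typically compiles to fild qword ptr
}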

Before AVX-512, x86 doesn't have unsigned <-> FP instructions.
(With AVX-512F, see vcvtusi2sd and vcvtsd2usi, and their respective ss versions. AVX-512F also adds packed SIMD conversions involving 64-bit integers, which are likewise new; before it, packed conversions could only go to/from int32_t.)


In 64-bit code, unsigned 32-bit -> FP is easy: just zero-extend u32 to i64 and use signed 64-bit conversion. Every uint32_t value is representable as a non-negative int64_t.

For the reverse direction, convert FP -> i64 and truncate to u32, if you're OK with what happens for out-of-range FP inputs (including getting 0 when the value is out of range for i64, and otherwise taking the low 32 bits of the 2's-complement i64 bit pattern).
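A minimal sketch of that strategy (the function name is mine; the out-of-range double-to-i64 conversion is formally undefined in C, so this leans on x86's cvttsd2si behavior):

#include <stdint.h>

uint32_t double_to_u32(double d) {
    // cvttsd2si turns out-of-range inputs into 0x8000000000000000
    // (the "integer indefinite"), whose low 32 bits are 0.
    return (uint32_t)(int64_t)d;
}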


u32 -> FP: See @Igor Skochinsky's answer for compiler output. x86-64 GCC and Clang use the same trick as x64 MSVC. The key part is to zero-extend to 64-bit and convert. Note that writing a 32-bit register implicitly zero-extends to 64-bit, so you may not need the mov r32, r32 if you know the value was written with a 32-bit operation (or if you load it from memory yourself).

; assuming your input starts in EDI, and that RDI might have garbage in the high half
; like a 32-bit function arg.

    mov     eax, edi              ; mov-elimination wouldn't work with  edi,edi
    vcvtsi2sd xmm0, xmm7, rax     ; where XMM7 is some cold register to avoid a false dep

The choice of anything other than mov edi,edi (if you need a separate instruction for zero-extension) is motivated by mov-elimination not working in the same,same register case: see Can x86's MOV really be "free"? Why can't I reproduce this at all?.

If you don't have AVX, or don't know a not-recently-written register to use, you may want to use pxor xmm0, xmm0 before the poorly-designed cvtsi2sd merges into it. GCC breaks false dependencies religiously; clang is pretty cavalier unless a loop-carried dep chain would exist inside a single function, so it can be slowed down by interactions between separate non-inlined functions that happen to get called in a loop. See Why does adding an xorps instruction make this function using cvtsi2ss and addss ~5x faster? for an example where this bites clang (but GCC is fine).

That answer also links some GCC missed-optimization bug reports where I wrote more details about the idea of reusing a "cold" register to avoid false dependencies in conversion and stuff like [v]sqrtsd which is also a 1-input operation.


32-bit mode:

Different compilers have different strategies. gcc -O3 -m32 -mfpmath=sse -msseregparm is a good way to see what GCC does, making it return in XMM0 instead of ST0 so it only uses x87 when that's actually more convenient. (e.g. for 64-bit -> FP using fild).

I put some u32 and u64 -> float or double test functions on Godbolt with gcc and clang, but this answer is mostly aiming to answer the x86-64 part of the question which other answers didn't cover well, not obsolete 32-bit codegen. So I'm not going to copy the code and asm here and dissect it.

I will mention that double can exactly represent every u32, which allows a simple (double)(int)(u - 2^31) + (double)2^31 trick to range-shift through the signed conversion, as sketched below. But u32 -> float isn't so easy: float can't represent every u32 exactly, so the convert-then-add would round twice.
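A minimal sketch of that range-shift (my naming; the cast of values >= 2^31 back to int32_t is technically implementation-defined in C before C23, but mainstream x86 compilers do the expected 2's-complement wrap):

#include <stdint.h>

double u32_to_double_shift(uint32_t x) {
    // Wrap down by 2^31 so the value fits in int32 (signed conversion
    // exists everywhere), then add 2^31 back in the double domain.
    // Exact, because double represents every uint32_t value.
    return (double)(int32_t)(x - 0x80000000u) + 2147483648.0;
}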

Licensed under: CC-BY-SA with attribution