Question

I'm having trouble with a gcc inline asm statement; gcc seems to think the result is a constant (which it isn't) and optimizes the statement away. I think I am using the operand constraints correctly, but would like a second opinion on the matter. If the problem is not in my use of constraints, I'll try to isolate a test case for a gcc bug report, but that may be difficult as even subtle changes in the surrounding code cause the problem to disappear.

The inline asm in question is

static inline void
ularith_div_2ul_ul_ul_r (unsigned long *r, unsigned long a1,
                 const unsigned long a2, const unsigned long b)
{
  ASSERT(a2 < b); /* Or there will be quotient overflow */
  __asm__(
            "# ularith_div_2ul_ul_ul_r: divq %0 %1 %2 %3\n\t"
            "divq %3"
            : "+a" (a1), "=d" (*r)
            : "1" (a2), "rm" (b)
            : "cc");
}

which is a pretty run-of-the-mill remainder of a two-word dividend by a one-word divisor. Note that the high word of the input, a2, and the remainder output, *r, are tied to the same register %rdx by the "1" constraint.
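For comparison, the value the asm is expected to produce can be written in plain C. This is only a sketch: the name ref_div_2ul_ul_ul_r is hypothetical (it is not part of the original code), and it assumes a 64-bit unsigned long and GCC's unsigned __int128 extension. A reference like this may help when isolating a test case, since its result can be checked against the asm version.

```c
#include <assert.h>

/* Hypothetical reference helper, not part of the original code.
   Assumes a 64-bit unsigned long and GCC's unsigned __int128 extension. */
static inline void
ref_div_2ul_ul_ul_r (unsigned long *r, unsigned long a1,
                     const unsigned long a2, const unsigned long b)
{
  assert(a2 < b); /* Same precondition as the asm version */
  /* Build the two-word dividend a2 * 2^64 + a1, then take the remainder. */
  unsigned __int128 n = ((unsigned __int128) a2 << 64) | a1;
  *r = (unsigned long) (n % b);
}
```

With a1 = 0 and a2 = 1 this computes 2^64 mod b, which depends on b and is not the constant 1 in general.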

From the surrounding code, ularith_div_2ul_ul_ul_r() gets effectively called as if by

if (s == 1)
  modpp[0].one = 0;
else
  ularith_div_2ul_ul_ul_r(&modpp[0].one, 0UL, 1UL, s);

so the high word of the input, a2, is the constant 1UL. The resulting asm output of gcc -S -fverbose-asm looks like:

(earlier:)
        xorl    %r8d, %r8d      # cstore.863
(then:)
        cmpq    $1, -208(%rbp)  #, %sfp
        movl    $1, %eax        #, tmp841
        movq    %rsi, -184(%rbp)        # prephitmp.966, MEM[(struct __modulusredcul_t *)&modpp][0].invm
        cmovne  -208(%rbp), %rcx        # prephitmp.966,, %sfp, prephitmp.966
        cmovne  %rax, %r8       # cstore.863,, tmp841, cstore.863
        movq    %r8, -176(%rbp) # cstore.863, MEM[(struct __modulusredcul_t *)&modpp][0].one

The effect is that the result of the ularith_div_2ul_ul_ul_r() call is assumed to be the constant 1; the divq never appears in the output.

Various changes make the problem disappear: different compiler flags, a different code context, or marking the asm block __asm__ __volatile__ (...). The output then correctly contains the divq instruction:

#APP
        # ularith_div_2ul_ul_ul_r: divq %rax %rdx %rdx -208(%rbp)       # a1, tmp590, tmp590, %sfp
        divq -208(%rbp) # %sfp
#NO_APP
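The __volatile__ variant mentioned above would look roughly like the following. This is a sketch of the workaround only, for x86-64 with gcc: the name ularith_div_2ul_ul_ul_r_volatile is hypothetical, the standard assert stands in for the project's ASSERT macro, and the comment line inside the asm template is omitted for brevity. The operand constraints are unchanged from the original.

```c
#include <assert.h>

/* Hypothetical variant of the original function with __volatile__ added.
   x86-64 only; constraints are identical to the original. */
static inline void
ularith_div_2ul_ul_ul_r_volatile (unsigned long *r, unsigned long a1,
                                  const unsigned long a2, const unsigned long b)
{
  assert(a2 < b); /* Or there will be quotient overflow */
  __asm__ __volatile__(
            "divq %3"
            : "+a" (a1), "=d" (*r)   /* quotient in %rax, remainder in %rdx */
            : "1" (a2), "rm" (b)     /* high word shares %rdx with *r */
            : "cc");
}
```

Marking the statement volatile keeps gcc from deleting it, but it also blocks legitimate optimizations such as removing the division when the result is unused, so it is a workaround rather than an explanation of the behaviour.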

So, my question to the inline assembly guys here: did I do something wrong with the constraints?


Solution

The bug affects only Ubuntu versions of gcc; the stock GNU gcc is unaffected as far as we can tell. The bug was reported on Ubuntu's Launchpad and confirmed: Bug #1029454.

Licensed under: CC-BY-SA with attribution