Question

Background:

I am new to assembly. When I was learning programming, I made a program that implements multiplication tables up to 1000 * 1000. The tables are formatted so that each answer is on the line factor1 << 10 | factor2 (I know, I know, it's isn't pretty). These tables are then loaded into an array: int* tables. Empty lines are filled with 0. Here is a link to the file for the tables (7.3 MB). I know using assembly won't speed up this by much, but I just wanted to do it for fun (and a bit of practice).

Question:

I'm trying to convert this code into inline assembly (tables is a global):

int answer;
// ...
answer = tables [factor1 << 10 | factor2];

This is what I came up with:

asm volatile ( "shll $10, %1;"
           "orl %1, %2;"
           "movl _tables(,%2,4), %0;" : "=r" (answer) : "r" (factor1), "r" (factor2) );

My C++ code works fine, but my assembly fails. What is wrong with my assembly (especially the movl _tables(,%2,4), %0; part), compared to my C++

What I have done to solve it:

I used some random numbers: 89 796 as factor1 and factor2. I know that there is an element at 89 << 10 | 786 (which is 91922) – verified this with C++. When I run it with gdb, I get a SIGSEGV:

Program received signal SIGSEGV, Segmentation fault.

at this line:

"movl _tables(,%2,4), %0;" : "=r" (answer) : "r" (factor1), "r" (factor2) );

I added two methods around my asm, which is how I know where the asm block is in the disassembly.

Disassembly of my asm block:

The disassembly from objdump -M att -d looks fine (although I'm not sure, I'm new to assembly, as I said):

402096: 8b 45 08                mov    0x8(%ebp),%eax
402099: 8b 55 0c                mov    0xc(%ebp),%edx
40209c: c1 e0 0a                shl    $0xa,%eax
40209f: 09 c2                   or     %eax,%edx
4020a1: 8b 04 95 18 e0 47 00    mov    0x47e018(,%edx,4),%eax
4020a8: 89 45 ec                mov    %eax,-0x14(%ebp)

The disassembly from objdump -M intel -d:

402096: 8b 45 08                mov    eax,DWORD PTR [ebp+0x8]
402099: 8b 55 0c                mov    edx,DWORD PTR [ebp+0xc]
40209c: c1 e0 0a                shl    eax,0xa
40209f: 09 c2                   or     edx,eax
4020a1: 8b 04 95 18 e0 47 00    mov    eax,DWORD PTR [edx*4+0x47e018]
4020a8: 89 45 ec                mov    DWORD PTR [ebp-0x14],eax

From what I understand, it's moving the first parameter of my void calc ( int factor1, int factor2 ) function into eax. Then it's moving the second parameter into edx. Then it shifts eax to the left by 10 and ors it with edx. A 32-bit integer is 4 bytes, so [edx*4+base_address]. Move result to eax and then put eax into int answer (which, I'm guessing is on -0x14 of the stack). I don't really see much of a problem.

Disassembly of the compiler's .exe:

When I replace the asm block with plain C++ (answer = tables [factor1 << 10 | factor2];) and disassemble it this is what I get in Intel syntax:

402096: a1 18 e0 47 00          mov    eax,ds:0x47e018
40209b: 8b 55 08                mov    edx,DWORD PTR [ebp+0x8]
40209e: c1 e2 0a                shl    edx,0xa
4020a1: 0b 55 0c                or     edx,DWORD PTR [ebp+0xc]
4020a4: c1 e2 02                shl    edx,0x2
4020a7: 01 d0                   add    eax,edx
4020a9: 8b 00                   mov    eax,DWORD PTR [eax]
4020ab: 89 45 ec                mov    DWORD PTR [ebp-0x14],eax

AT&T syntax:

402096: a1 18 e0 47 00          mov    0x47e018,%eax
40209b: 8b 55 08                mov    0x8(%ebp),%edx
40209e: c1 e2 0a                shl    $0xa,%edx
4020a1: 0b 55 0c                or     0xc(%ebp),%edx
4020a4: c1 e2 02                shl    $0x2,%edx
4020a7: 01 d0                   add    %edx,%eax
4020a9: 8b 00                   mov    (%eax),%eax
4020ab: 89 45 ec                mov    %eax,-0x14(%ebp)

I am not really familiar with the Intel syntax, so I am just going to try and understand the AT&T syntax:

It first moves the base address of the tables array into %eax. Then, is moves the first parameter into %edx. It shifts %edx to the left by 10 then ors it with the second parameter. Then, by shifting %edx to the left by two, it actually multiplies %edx by 4. Then, it adds that to %eax (the base address of the array). So, basically it just did this: [edx*4+0x47e018] (Intel syntax) or 0x47e018(,%edx,4) AT&T. It moves the value of the element it got into %eax and puts it into int answer. This method is more "expanded", but it does the same thing as my hand-written assembly! So why is mine giving a SIGSEGV while the compiler's working fine?

Was it helpful?

Solution

I bet (from the disassembly) that tables is a pointer to an array, not the array itself.

So you need:

 asm volatile ( "shll $10, %1;"
        movl  _tables,%%eax
       "orl %1, %2;"
       "movl (%%eax,%2,4)",
       : "=r" (answer) : "r" (factor1), "r" (factor2) : "eax" )   

(Don't forget the extra clobber in the last line).

There are of course variations, this may be more efficient if the code is in a loop:

 asm volatile ( "shll $10, %1;"
       "orl %1, %2;"
       "movl (%3,%2,4)",
       : "=r" (answer) : "r" (factor1), "r" (factor2), "r"(tables) )   

OTHER TIPS

This is intended to be an addition to Mats Petersson's answer - I wrote it simply because it wasn't immediately clear to me why OP's analysis of the disassembly (that his assembly and the compiler-generated one were equivalent) was incorrect.

As Mats Petersson explains, the problem is that tables is actually a pointer to an array, so to access an element, you have to dereference twice. Now to me, it wasn't immediately clear where this happens in the compiler-generated code. The culprit is this innocent-looking line:

a1 18 e0 47 00          mov    0x47e018,%eax

To the untrained eye (that includes mine), this might look like the value 0x47e018 is moved to eax, but it's actually not. The Intel-syntax representation of the same opcodes gives us a clue:

a1 18 e0 47 00          mov    eax,ds:0x47e018

Ah - ds: - so it's not actually a value, but an address!

For anyone who is wondering now, the following would be the opcodes and ATT syntax assembly for moving the value 0x47e018 to eax:

b8 18 e0 47 00          mov    $0x47e018,%eax
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top