Background:
I am new to assembly. When I was learning programming, I wrote a program that builds multiplication tables up to 1000 * 1000. The tables are laid out so that each answer is on line factor1 << 10 | factor2 (I know, I know, it isn't pretty). These tables are then loaded into an array: int* tables. Empty lines are filled with 0. Here is a link to the file for the tables (7.3 MB). I know using assembly won't speed this up by much, but I wanted to do it for fun (and a bit of practice).
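To make the index layout concrete, here is a minimal sketch of the packing described above. The helper names are hypothetical (they are not part of the original program), and it assumes both factors fit in 10 bits, which holds for values up to 1000.

```cpp
#include <cstdint>

// Hypothetical helper: packs the two factors into one table index,
// factor1 in the high bits and factor2 in the low 10 bits.
// Note that << binds tighter than |, so this matches factor1 << 10 | factor2.
constexpr std::uint32_t table_index(std::uint32_t factor1, std::uint32_t factor2) {
    return (factor1 << 10) | factor2;
}

// Hypothetical inverses: recover each factor from a packed index.
constexpr std::uint32_t index_factor1(std::uint32_t index) { return index >> 10; }
constexpr std::uint32_t index_factor2(std::uint32_t index) { return index & 0x3FF; }

// The largest index the table needs is the one for 1000 * 1000.
static_assert(table_index(1000, 1000) == 1025000, "largest packed index");
```

Because 1000 is below 1024, the low 10 bits never overflow into the factor1 field, which is why a plain or is enough to combine the two values.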
Question:
I'm trying to convert this code into inline assembly (tables is a global):
int answer;
// ...
answer = tables [factor1 << 10 | factor2];
This is what I came up with:
asm volatile ( "shll $10, %1;"
"orl %1, %2;"
"movl _tables(,%2,4), %0;" : "=r" (answer) : "r" (factor1), "r" (factor2) );
My C++ code works fine, but my assembly fails. What is wrong with my assembly (especially the movl _tables(,%2,4), %0; part) compared to my C++?
What I have done to solve it:
I picked some arbitrary numbers, 89 and 786, as factor1 and factor2. I know that there is an element at 89 << 10 | 786 (which is 91922); I verified this with C++. When I run it with gdb, I get a SIGSEGV:
Program received signal SIGSEGV, Segmentation fault.
at this line:
"movl _tables(,%2,4), %0;" : "=r" (answer) : "r" (factor1), "r" (factor2) );
I added two function calls around my asm block, which is how I know where the asm block is in the disassembly.
Disassembly of my asm block:
The disassembly from objdump -M att -d looks fine (although I'm not sure; as I said, I'm new to assembly):
402096: 8b 45 08 mov 0x8(%ebp),%eax
402099: 8b 55 0c mov 0xc(%ebp),%edx
40209c: c1 e0 0a shl $0xa,%eax
40209f: 09 c2 or %eax,%edx
4020a1: 8b 04 95 18 e0 47 00 mov 0x47e018(,%edx,4),%eax
4020a8: 89 45 ec mov %eax,-0x14(%ebp)
The disassembly from objdump -M intel -d:
402096: 8b 45 08 mov eax,DWORD PTR [ebp+0x8]
402099: 8b 55 0c mov edx,DWORD PTR [ebp+0xc]
40209c: c1 e0 0a shl eax,0xa
40209f: 09 c2 or edx,eax
4020a1: 8b 04 95 18 e0 47 00 mov eax,DWORD PTR [edx*4+0x47e018]
4020a8: 89 45 ec mov DWORD PTR [ebp-0x14],eax
From what I understand, it moves the first parameter of my void calc ( int factor1, int factor2 ) function into eax, then moves the second parameter into edx. It then shifts eax to the left by 10 and ors it into edx. A 32-bit integer is 4 bytes, hence the [edx*4+base_address] operand. It moves the result into eax and then stores eax into int answer (which, I'm guessing, is at ebp-0x14 on the stack). I don't really see much of a problem.
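As a side calculation, the address arithmetic of the final mov can be sketched as follows. This is only an illustration of how a disp(,index,scale) operand is evaluated in 32-bit code; the function name is hypothetical, and the values are the displacement from the disassembly and the index 91922 from my test.

```cpp
#include <cstdint>

// Hypothetical helper: effective address of the AT&T operand disp(,index,scale),
// i.e. [index*scale + disp] in Intel syntax. No memory is accessed here;
// this only computes where the mov would read from.
constexpr std::uint32_t effective_address(std::uint32_t disp,
                                          std::uint32_t index,
                                          std::uint32_t scale) {
    return disp + index * scale;
}

// 0x47e018 + 91922 * 4: the address read by
// mov 0x47e018(,%edx,4),%eax when edx holds 91922.
static_assert(effective_address(0x47e018, 91922, 4) == 0x4d7c60,
              "address read by the hand-written mov");
```

The sketch deliberately stops at the address; it makes no claim about what is stored there.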
Disassembly of the compiler's .exe:
When I replace the asm block with plain C++ (answer = tables [factor1 << 10 | factor2];) and disassemble it, this is what I get in Intel syntax:
402096: a1 18 e0 47 00 mov eax,ds:0x47e018
40209b: 8b 55 08 mov edx,DWORD PTR [ebp+0x8]
40209e: c1 e2 0a shl edx,0xa
4020a1: 0b 55 0c or edx,DWORD PTR [ebp+0xc]
4020a4: c1 e2 02 shl edx,0x2
4020a7: 01 d0 add eax,edx
4020a9: 8b 00 mov eax,DWORD PTR [eax]
4020ab: 89 45 ec mov DWORD PTR [ebp-0x14],eax
AT&T syntax:
402096: a1 18 e0 47 00 mov 0x47e018,%eax
40209b: 8b 55 08 mov 0x8(%ebp),%edx
40209e: c1 e2 0a shl $0xa,%edx
4020a1: 0b 55 0c or 0xc(%ebp),%edx
4020a4: c1 e2 02 shl $0x2,%edx
4020a7: 01 d0 add %edx,%eax
4020a9: 8b 00 mov (%eax),%eax
4020ab: 89 45 ec mov %eax,-0x14(%ebp)
I am not really familiar with the Intel syntax, so I am just going to try and understand the AT&T syntax:
It first moves the base address of the tables array into %eax. Then, it moves the first parameter into %edx. It shifts %edx to the left by 10, then ors it with the second parameter. Then, by shifting %edx to the left by two, it effectively multiplies %edx by 4. Then, it adds that to %eax (the base address of the array). So, basically it just did this: [edx*4+0x47e018] (Intel syntax) or 0x47e018(,%edx,4) (AT&T syntax). It moves the value of the element it got into %eax and puts it into int answer. This method is more "expanded", but it does the same thing as my hand-written assembly! So why is mine giving a SIGSEGV while the compiler's version works fine?
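For reference, the compiler's instruction sequence can be mirrored one step at a time in C++. This is a rough sketch, not the compiler's actual intermediate form: the function name is hypothetical, the local variables are named after the registers only for readability, and tables here is a stand-in global that must be pointed at a valid array before the call.

```cpp
// Stand-in for the global from the question; it must be set to a valid
// array before lookup_like_compiler is called.
int* tables = nullptr;

// Hypothetical function that follows the compiler's disassembly line by line.
int lookup_like_compiler(int factor1, int factor2) {
    // mov 0x47e018,%eax : load the pointer value stored in the global tables
    char* eax = reinterpret_cast<char*>(tables);
    // mov 0x8(%ebp),%edx : first parameter
    unsigned edx = static_cast<unsigned>(factor1);
    // shl $0xa,%edx
    edx <<= 10;
    // or 0xc(%ebp),%edx : combine with the second parameter
    edx |= static_cast<unsigned>(factor2);
    // shl $0x2,%edx : scale the index by sizeof(int), assumed to be 4
    edx <<= 2;
    // add %edx,%eax : byte address of the element
    eax += edx;
    // mov (%eax),%eax : load the element itself
    return *reinterpret_cast<int*>(eax);
}
```

A quick way to try it is to point tables at a small zero-filled array, store a known value at index 1 << 10 | 2, and check that lookup_like_compiler(1, 2) returns it.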