x86_64 Assembly Linux System Call Confusion

https://stackoverflow.com/questions/8510333

16-03-2021

Question

I am currently learning assembly language on Linux. I have been using the book 'Programming From the Ground Up', and all of its examples are 32-bit. My OS is 64-bit, so I have been trying to do all the examples in 64-bit. However, I am having trouble:

.section .data

.section .text
.global _start
_start:
movq $60, %rax
movq $2, %rbx
int $0x80

This should simply call the Linux exit system call. Instead it causes a segfault, whereas when I do this

.section .data

.section .text
.global _start
_start:
movq $1, %rax
movq $2, %rbx
int $0x80

it works. Clearly the problem is the value I move to %rax. The value $1 that I use in the second example is what 'Programming From the Ground Up' says to use; however, multiple sources on the Internet say that the 64-bit system call number is $60. What am I doing wrong? Also, what other issues should I watch out for, and what should I use for a reference? In case it matters, I am on Chapter 5 of 'Programming From the Ground Up'.

Solution

You're running into one surprising difference between i386 and x86_64: they don't use the same system call mechanism. The correct code is:

movq $60, %rax
movq $2,  %rdi   # not %rbx!
syscall

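Interrupt 0x80 always invokes 32-bit system calls, with the 32-bit call numbers and register conventions, even from a 64-bit process. It exists to allow 32-bit applications to run on 64-bit systems. In the 32-bit table, number 1 is exit, which is why your second program works; number 60 is umask, so your first program never exits and crashes when execution runs past the end of your code.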

For the purposes of learning, you should probably try to follow the tutorial exactly, rather than translating on the fly to 64-bit -- there are a few other significant behavioral differences that you're likely to run into. Once you're familiar with i386, then you can pick up x86_64 separately.

OTHER TIPS

Please read 'What are the calling conventions for UNIX & Linux system calls on x86-64'.

Also note that using int 0x80 for system calls on x64 systems is an old compatibility layer. You should use the syscall instruction on x64 systems.

You can still use the old method, but you need to build your binaries in 32-bit x86 mode; see your compiler/assembler manual for details.

duskwuff's answer correctly points out that the mechanism for system calls differs between 64-bit x86 Linux and 32-bit Linux.

However, this answer is incomplete and misleading for a couple of reasons:

The change was actually introduced before 64-bit systems became popular, motivated by the observation that int 0x80 was very slow on Pentium 4. Linus Torvalds coded up a solution using the SYSENTER/SYSEXIT instructions (which had been introduced by Intel around the Pentium Pro era, but which were buggy and gave no practical benefit). So modern 32-bit Linux systems actually use SYSENTER, not int 0x80.
64-bit x86 Linux kernels do not actually use SYSENTER and SYSEXIT. They actually use the very similar SYSCALL/SYSRET instructions.

As pointed out in the comments, SYSENTER does not actually work on many 64-bit Linux systems—namely 64-bit AMD systems.

It's an admittedly confusing situation. The gory details are here, but what it comes down to is this:

For a 32bit kernel, SYSENTER/SYSEXIT are the only compatible pair [between AMD and Intel CPUs]

For a 64bit kernel in Long mode only… SYSCALL/SYSRET are the only compatible pair [between AMD and Intel CPUs]

It appears that on an Intel CPU in 64-bit mode, you can get away with using SYSENTER because it does the same thing as SYSCALL, however this is not the case for AMD systems.

Bottom line: always use SYSCALL on Linux on 64-bit x86 systems. It's what the x86-64 ABI actually specifies. (See this great wiki answer for even more details.)

Quite a lot has changed between i386 and x86_64, including both the instruction used to enter the kernel and the registers used to carry system call arguments. Here is code equivalent to yours:

.section .data

.section .text
.global _start
_start:
movq $60, %rax
movq $2, %rdi
syscall
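
If you want to check the status that call hands back without assembling anything, the same exit call can be made in a child process from C and its status read in the parent. This is only a sketch and not part of the original answers: status_of_raw_exit is a made-up helper name, and it goes through libc's syscall() wrapper rather than a hand-written syscall instruction.

```c
#define _GNU_SOURCE        /* exposes syscall() in <unistd.h> */
#include <sys/syscall.h>   /* SYS_exit */
#include <sys/types.h>     /* pid_t */
#include <sys/wait.h>      /* waitpid, WIFEXITED, WEXITSTATUS */
#include <unistd.h>        /* fork, syscall */

/* Hypothetical helper (name invented for this example): run the raw exit
 * system call in a child process and return the status the parent sees,
 * or -1 if the child could not be started or did not exit normally. */
int status_of_raw_exit(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: the same call as the assembly above, with the number
         * taken from the headers instead of being hard-coded. */
        syscall(SYS_exit, code);
        for (;;) { }       /* not reached: exit does not return */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling status_of_raw_exit(2) should give 2, the same value a shell would show in $? after running the assembled program.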

Quoting from this answer to a related question:

The syscall numbers are in the Linux source code under arch/x86/include/asm/unistd_64.h. The syscall number is passed in the rax register. The parameters are in rdi, rsi, rdx, r10, r8, r9. The call is invoked with the "syscall" instruction. The syscall overwrites the rcx register. The return is in rax.

If you check /usr/include/asm/unistd_32.h, exit corresponds to 1, but in /usr/include/asm/unistd_64.h, exit corresponds to 60.
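
The same difference shows up from C, since <sys/syscall.h> pulls in the table that matches the build target. A tiny sketch, assuming the usual glibc and kernel headers are installed (exit_syscall_number is a made-up name):

```c
#include <sys/syscall.h>   /* defines SYS_exit for the build target */

/* Hypothetical helper: the system call number this particular build
 * uses for exit. */
long exit_syscall_number(void)
{
    return SYS_exit;
}
```

Built for x86-64 this returns 60; built as 32-bit code (for example with gcc -m32, if the 32-bit headers are installed) it returns 1. Taking the number from the headers avoids hard-coding the wrong table.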

Licensed under: CC-BY-SA with attribution
