Question

I am trying to learn assembly from scratch. I have been reading up quite a bit, but even the following simple program I found in a reference book has me stumped:

section .data
msg db "Hello!", 0xa
len equ $ - msg
section .text

     global _start

_start:


mov edx, len
mov ecx, msg
mov ebx, 1
mov eax, 4
int 0x80
mov ebx, 0
mov eax, 1
int 0x80

Now apparently this is supposed to print "Hello!". But I don't even know what's happening at any of the stages. The first two instructions put the message length and the message in two registers, which are never used again. I don't understand why.

I don't know why four different registers are needed.

Solution

int 0x80 is a mechanism in some (*a) UNIX-like operating systems for making system calls.

For these calls, the registers are used for specific values. From the syscalls file:

0 STD NOHIDE { int nosys(void); } syscall nosys_args int
1 STD NOHIDE { void exit(int rval); } exit rexit_args void
2 STD POSIX  { int fork(void); }
3 STD POSIX  { ssize_t read(int fd, void *buf, size_t nbyte); }
4 STD POSIX  { ssize_t write(int fd, const void *buf, size_t nbyte); }

You can see that number 4 is the write call, which takes three parameters. Number 1 is exit, which takes only the return code.

When making the call, eax holds the number of the syscall you're making, while ebx, ecx and edx hold the parameters in order (assuming they're all needed; exit, for example, needs only one).

So, you can comment the code as follows:

mov edx, len    ; length of message (nbyte).
mov ecx, msg    ; address of message to print (buf).
mov ebx, 1      ; file handle 1, stdout (fd).
mov eax, 4      ; write syscall.
int 0x80        ; do it.

mov ebx, 0      ; exit code (rval).
mov eax, 1      ; exit syscall.
int 0x80        ; do it.
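
For comparison, here is a rough C sketch of the same two calls, assuming Linux with glibc and its generic syscall() wrapper. The helper names raw_write and hello_then_exit are made up for illustration. The symbolic constants SYS_write and SYS_exit are used instead of 4 and 1 because those literal numbers belong to the 32-bit int 0x80 table and differ on other ABIs.

```c
#define _GNU_SOURCE          /* makes glibc declare syscall() */
#include <stddef.h>
#include <sys/syscall.h>     /* SYS_write, SYS_exit */
#include <unistd.h>          /* syscall() */

static const char msg[] = "Hello!\n";   /* same bytes as msg in the .data section */

/* Hypothetical helper: issue write the way the assembly does,
 * syscall number first, then the three arguments in prototype order. */
long raw_write(int fd, const void *buf, size_t nbyte)
{
    /* eax = SYS_write, ebx = fd, ecx = buf, edx = nbyte */
    return syscall(SYS_write, fd, buf, nbyte);
}

/* Hypothetical helper mirroring the whole listing:
 * write(1, msg, len) followed by exit(0). */
void hello_then_exit(void)
{
    raw_write(1, msg, sizeof msg - 1);  /* sizeof - 1 drops the trailing NUL: len = 7 */
    syscall(SYS_exit, 0);               /* eax = SYS_exit, ebx = 0 */
}
```

Calling hello_then_exit() from main should behave like the assembly program: every value loaded into a register in the listing shows up here as one argument of a call.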

*a: Later versions of Linux introduced a new interface that can choose between different methods depending on which is fastest. For example, some Intel chips are much faster if you use sysenter rather than int 0x80.

OTHER TIPS

IIRC the int 0x80 instruction is used to invoke a syscall through the interrupt vector. In your example the value in eax specifies which syscall you are going to call (here the write operation), and the value in ebx says where to write (stdout).

By convention, the syscall expects ecx and edx to describe what is going to be printed: the address of the message and its length.

On many systems, int 80h is the system call gate. The syscall number is in eax. ebx, ecx and edx contain additional parameters:

mov edx, len
mov ecx, msg
mov ebx, 1     ; fd 1 is stdout
mov eax, 4     ; syscall 4 is write
int 0x80       ; write(1, msg, len)
mov ebx, 0
mov eax, 1     ; syscall 1 is exit
int 0x80       ; exit(0)

When you make a system call with the 'int' mnemonic, a software interrupt is generated. It effectively "jumps" to a system function, which in this case prints output (which function runs depends on eax).

The interrupt handler uses those registers to know what to do. It reads eax, checks which function you want, and uses the other registers as that function's arguments.

eax is the function number; 4 means sys_write, which writes a string to a stream/file descriptor.

Now it knows you want to write something somewhere, so it reads the other registers to get that information.

For eax = 4 and int 0x80, this is the meaning of the other registers:

ebx = output (1 = stdout)
ecx = address of the string
edx = length of the string
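
As a rough illustration of that dispatch, here is a toy C model. It is not kernel code; the names toy_regs and toy_int80 are invented for this sketch, and it only knows the two calls used in the question. It shows why all four registers matter: eax picks the function, and the function then decides how many of the other registers it reads.

```c
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>   /* write(), _exit() */

/* Toy stand-in for the four registers; not a real kernel structure. */
struct toy_regs {
    uintptr_t eax;   /* syscall number */
    uintptr_t ebx;   /* first argument */
    uintptr_t ecx;   /* second argument */
    uintptr_t edx;   /* third argument */
};

/* Toy model of what happens on int 0x80: look at eax, then read
 * only as many argument registers as that call needs. */
long toy_int80(const struct toy_regs *r)
{
    switch (r->eax) {
    case 4:  /* write(fd, buf, nbyte): uses ebx, ecx and edx */
        return (long)write((int)r->ebx, (const void *)r->ecx, (size_t)r->edx);
    case 1:  /* exit(rval): only ebx is used, and the call never returns */
        _exit((int)r->ebx);
    default:
        return -1;   /* unknown number; the toy just reports failure */
    }
}
```

Filling a toy_regs with eax = 4, ebx = 1, ecx = the address of the message and edx = its length, then passing it to toy_int80, corresponds to the first int 0x80 in the program.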

You can read this:

http://www.intel.com/Assets/ja_JP/PDF/manual/253665.pdf

Section 6.4 has some material about interrupts and exceptions.

You can also start by writing Intel 80x86 assembly code, which is simpler and easier to understand. Here are some links:

Mnemonics/Code tables cheatsheet: http://www.jegerlehner.ch/intel/

Some introduction sites:
http://mysite.du.edu/~etuttle/math/8086.htm
http://www.malware.org/teaching/assembly.htm

Licensed under: CC-BY-SA with attribution