Question

When we compile any C code with gcc -c and run objdump -d <filename>.o, we see:

Disassembly of section .text:
0000000000000000 <main>:
 0:   55                      push   %rbp
 1:   48 89 e5                mov    %rsp,%rbp
 4:   48 83 ec 10             sub    $0x10,%rsp
 8:   48 8d 45 fc             lea    -0x4(%rbp),%rax
 c:   48 89 c7                mov    %rax,%rdi
 f:   b8 00 00 00 00          mov    $0x0,%eax
 . . .

But after linking with gcc -o prog -L/library/path -llibrary *.o, the offsets change:

0000000000400644 <main>:
400644:       55                      push   %rbp
400645:       48 89 e5                mov    %rsp,%rbp
400648:       48 83 ec 10             sub    $0x10,%rsp
40064c:       48 8d 45 fc             lea    -0x4(%rbp),%rax
400650:       48 89 c7                mov    %rax,%rdi
400653:       b8 00 00 00 00          mov    $0x0,%eax

How is the offset calculated after linking?

We basically get three sets of addresses: (1) after compiling, (2) after linking, and (3) after loading.

How are these addresses related?

Solution 2

gcc -c runs the compiler proper (cc1), which produces assembly code, and then the assembler (as). Run gcc -v -c to see exactly what is happening.

The compiler (actually cc1) translates your C code into assembly code.

The assembler (as) then translates that assembly code into an object file (in the Executable and Linkable Format, a.k.a. ELF).

An ELF object file contains sections of bytes (e.g. the code, or .text, section and the .data section) together with relocation entries, and it defines (and uses) symbolic references. Relocation is processor dependent; see e.g. this list of relocation types and the x86-64 ABI (application binary interface) spec. These relocations are processed by the linker (ld, started by gcc). Read Levine's Linkers and Loaders book. So ELF object files contain bytes with relocation entries and symbol tables.

Once the linker knows the final addresses, it patches some machine instructions (or other data) according to these relocation entries.
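To make the patching step concrete, here is a minimal sketch of how a linker could apply two common x86-64 relocation kinds: an absolute 64-bit one (S + A) and a PC-relative 32-bit one (S + A - P), where S is the symbol's final address, A the addend and P the address of the patched field. The struct reloc type and the apply_reloc function are hypothetical simplifications for illustration, not the real Elf64_Rela layout or ld's code, and the sketch assumes a little-endian host like the x86-64 target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified relocation entry (not the real Elf64_Rela). */
enum reloc_kind { RELOC_ABS64, RELOC_PC32 };

struct reloc {
    enum reloc_kind kind;
    uint64_t offset;   /* where in the section to patch */
    int64_t  addend;   /* constant A stored in the relocation entry */
};

/* Patch `section`, which the linker placed at `section_addr`, so that the
 * field at r->offset refers to a symbol whose final address is `sym_addr`.
 * Assumes a little-endian host, matching the x86-64 target. */
void apply_reloc(uint8_t *section, uint64_t section_addr,
                 const struct reloc *r, uint64_t sym_addr)
{
    uint64_t place = section_addr + r->offset;            /* P */

    if (r->kind == RELOC_ABS64) {
        uint64_t value = sym_addr + (uint64_t)r->addend;  /* S + A */
        memcpy(section + r->offset, &value, sizeof value);
    } else {
        /* S + A - P, which must fit in a signed 32-bit field */
        int64_t value = (int64_t)(sym_addr + (uint64_t)r->addend - place);
        assert(value >= INT32_MIN && value <= INT32_MAX);
        int32_t v32 = (int32_t)value;
        memcpy(section + r->offset, &v32, sizeof v32);
    }
}
```

For example, with made-up numbers: a call opcode (e8) at offset 0x14 of a section placed at 0x400644 has its 4-byte displacement at offset 0x15. With addend -4 and a target symbol at 0x400500, apply_reloc stores -349, so the CPU computes 0x40065d (the address of the next instruction) plus -349, which is 0x400500.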

OTHER TIPS

Remember that the object file contains only your code, so its .text section always starts at offset zero.

When you link, you add modules from other sources, like the runtime-initialization code and library functions. You don't know the size of these objects, or where they will be placed in the resulting executable file, and therefore can't calculate the offsets of the different parts of your code yourself. Also, if you have multiple object files, the linker may rearrange them as it sees fit.
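As a rough sketch of what the linker does here: it places each input's .text contribution one after another (respecting alignment), and the final address of any instruction is the start address of its contribution plus its offset within the object file. The struct input_text type, the function names, and the base address and sizes used in the note below are hypothetical; only the 0x400644, 0x40064c and 0x400653 addresses come from the question.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one input file's .text contribution. */
struct input_text {
    uint64_t size;    /* bytes of code in this object file */
    uint64_t align;   /* required alignment, a power of two */
    uint64_t addr;    /* filled in by layout(): where the linker put it */
};

/* Round x up to the next multiple of align (align must be a power of two). */
uint64_t align_up(uint64_t x, uint64_t align)
{
    return (x + align - 1) & ~(align - 1);
}

/* Place the inputs one after another starting at `base`.
 * Returns the first address after the last input. */
uint64_t layout(struct input_text *in, size_t n, uint64_t base)
{
    uint64_t cursor = base;
    for (size_t i = 0; i < n; i++) {
        cursor = align_up(cursor, in[i].align);
        in[i].addr = cursor;
        cursor += in[i].size;
    }
    return cursor;
}

/* Final address of something that sat at `offset` inside input `in`. */
uint64_t linked_address(const struct input_text *in, uint64_t offset)
{
    return in->addr + offset;
}
```

If, say, 0x244 bytes of startup code were placed at a base of 0x400400 (both numbers invented), the next object's code would start at 0x400644, and the instructions at offsets 0x8 and 0xf in that object would land at 0x40064c and 0x400653, matching the linked listing in the question.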

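The exact virtual address your code ends up at when running depends partly on the linker and partly on the operating system. For a fixed-position executable like the one above, the loader maps the code at the addresses the linker chose; for position-independent executables and shared libraries, the operating system picks the base address, typically with address-space layout randomization.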
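The relation between link-time and run-time addresses can be sketched as a single addition. The load_bias parameter below is an assumed name for the difference between where the image was actually mapped and the addresses the linker assigned: it is 0 for a fixed-position executable, and chosen by the loader for position-independent code. Both function names are hypothetical.

```c
#include <stdint.h>

/* Run-time address of something the linker placed at `link_vaddr`.
 * `load_bias` is 0 for a fixed-position executable and loader-chosen
 * for position-independent executables and shared libraries. */
uint64_t runtime_address(uint64_t load_bias, uint64_t link_vaddr)
{
    return load_bias + link_vaddr;
}

/* Recover the bias from one observed run-time address and the
 * link-time address of the same symbol. */
uint64_t load_bias_from(uint64_t observed_runtime, uint64_t link_vaddr)
{
    return observed_runtime - link_vaddr;
}
```

With a bias of 0, main from the question stays at 0x400644 at run time. For a position-independent build the same code would appear at whatever bias the loader picked plus its link-time address.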

Each platform/architecture has a specific memory layout that it expects for executables. It's the task of the linker to combine and relocate object files in a way that fits your platform/architecture.

With gcc toolchains, this adaptation is done through linker scripts. If you are curious, you can look at them: the default linker scripts usually live in a directory called ldscripts, with file extensions such as .x or .xbn, and ld --verbose prints the default script.

Licensed under: CC-BY-SA with attribution