Question

I have two simple snippets:

a.c

extern int shared;
void swap(int *a, int *b);   /* defined in b.c */

int main()
{
    int a = 100;
    swap(&a, &shared);
}

b.c

int shared = 1;

void swap(int *a, int *b)
{
    int temp = *a;
    *a = *b;
    *b = temp;
}

When I just use gcc a.c b.c then it works. But it fails when I use:

gcc -c a.c b.c
ld a.o b.o -e main -o ab

Running ./ab causes a segmentation fault, and it seems that the linker cannot resolve shared. The output of objdump -d ab appears to confirm this:

  4000f0:   c7 45 fc 64 00 00 00    movl   $0x64,-0x4(%rbp)
  4000f7:   48 8d 45 fc             lea    -0x4(%rbp),%rax

You can see that the address of shared is still 00 00 00.

I suspect that something I did leads to this error. What is it?

Thanks in advance.


Solution

This happens because the generated ELF file differs from what the toolchain normally produces (or at least is not suitable for running on your system): it is missing quite a few of the sections found in a standard executable. readelf can help you see the difference. For example, run gcc -o ab a.c b.c and then readelf -S ./ab; you'll see 29 sections (on my system). Then create another ELF with gcc -c a.c b.c and ld a.o b.o -e main -o ab2, and run readelf -S ./ab2; now there are only 18 sections (in my case).
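The section count that readelf -S reports comes from the e_shnum field of the ELF header, so the comparison above can also be made programmatically. The following is a minimal sketch, not part of the original answer: the function name elf_section_count is hypothetical, and it assumes a Linux system that provides <elf.h> and a file whose byte order matches the host. It does not handle files that use extended section numbering.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return the number of section headers recorded in
 * the ELF header of `path`, or -1 if the file cannot be read or does not
 * look like an ELF file. Assumes the file uses the host's byte order. */
int elf_section_count(const char *path)
{
    unsigned char buf[sizeof(Elf64_Ehdr)];
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t n = fread(buf, 1, sizeof buf, f);
    fclose(f);

    /* Every ELF file starts with the magic bytes 0x7f 'E' 'L' 'F'. */
    if (n < sizeof(Elf32_Ehdr) || memcmp(buf, ELFMAG, SELFMAG) != 0)
        return -1;

    if (buf[EI_CLASS] == ELFCLASS64 && n == sizeof(Elf64_Ehdr)) {
        Elf64_Ehdr h;
        memcpy(&h, buf, sizeof h);   /* copy to avoid alignment issues */
        return h.e_shnum;
    }
    if (buf[EI_CLASS] == ELFCLASS32) {
        Elf32_Ehdr h;
        memcpy(&h, buf, sizeof h);
        return h.e_shnum;
    }
    return -1;
}
```

Calling elf_section_count("./ab") and elf_section_count("./ab2") from a small main should then reproduce the two numbers that readelf -S prints for the respective files.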

If you look closely, you'll see that quite a few sections are missing, such as .init and .fini, along with startup symbols such as __libc_start_main. If you compare the objdump output of both executables, you'll see that a few particular things are done after main returns, in the .fini section (gcc internals, such as restoring the stack). These are the pieces that are missing compared with the standard ELF produced by gcc -o ab a.c b.c.

To confirm that the problem occurs when returning from main, you can use gdb. I tracked it down by building the object files with gcc -g3 -c a.c b.c and linking them with ld a.o b.o -e main -o ab. You'll see that the crash happens just as main is about to return: because main was made the entry point, nothing called it, so its final ret has no valid return address to jump to. I hope this gives you some idea of why it's happening.

Other tips

If you want to see the command that gcc actually executes to run the ld command, you can use the -v (verbose) option. For example, for a single source file zigzag7.c, the compilation on Mac OS X 10.9.2 with GCC 4.8.1 produced:

$ gcc -v -std=c99 -o zigzag7 zigzag7.c
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/gcc/v4.8.1/libexec/gcc/x86_64-apple-darwin12.5.0/4.8.1/lto-wrapper
Target: x86_64-apple-darwin12.5.0
Configured with: ../gcc-4.8.1/configure --prefix=/usr/gcc/v4.8.1
Thread model: posix
gcc version 4.8.1 (GCC) 
COLLECT_GCC_OPTIONS='-mmacosx-version-min=10.9.1' '-v' '-std=c99' '-o' 'zigzag7' '-mtune=core2'
 /usr/gcc/v4.8.1/libexec/gcc/x86_64-apple-darwin12.5.0/4.8.1/cc1 -quiet -v -D__DYNAMIC__ zigzag7.c -fPIC -quiet -dumpbase zigzag7.c -mmacosx-version-min=10.9.1 -mtune=core2 -auxbase zigzag7 -std=c99 -version -o /var/folders/lj/kt7909lm8xj2tl001s6z265r0000gq/T//ccRar4iZ.s
GNU C (GCC) version 4.8.1 (x86_64-apple-darwin12.5.0)
    compiled by GNU C version 4.8.1, GMP version 5.1.3, MPFR version 3.1.2, MPC version 1.0.1
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
ignoring nonexistent directory "/usr/gcc/v4.8.1/lib/gcc/x86_64-apple-darwin12.5.0/4.8.1/../../../../x86_64-apple-darwin12.5.0/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/gcc/v4.8.1/lib/gcc/x86_64-apple-darwin12.5.0/4.8.1/include
 /usr/local/include
 /usr/gcc/v4.8.1/include
 /usr/gcc/v4.8.1/lib/gcc/x86_64-apple-darwin12.5.0/4.8.1/include-fixed
 /usr/include
 /System/Library/Frameworks
 /Library/Frameworks
End of search list.
GNU C (GCC) version 4.8.1 (x86_64-apple-darwin12.5.0)
    compiled by GNU C version 4.8.1, GMP version 5.1.3, MPFR version 3.1.2, MPC version 1.0.1
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 48814df7d2c1a0636e2a53e05ef4ed75
COLLECT_GCC_OPTIONS='-mmacosx-version-min=10.9.1' '-v' '-std=c99' '-o' 'zigzag7' '-mtune=core2'
 as -arch x86_64 -force_cpusubtype_ALL -o /var/folders/lj/kt7909lm8xj2tl001s6z265r0000gq/T//ccVxV9gX.o /var/folders/lj/kt7909lm8xj2tl001s6z265r0000gq/T//ccRar4iZ.s
COMPILER_PATH=/usr/gcc/v4.8.1/libexec/gcc/x86_64-apple-darwin12.5.0/4.8.1/:/usr/gcc/v4.8.1/libexec/gcc/x86_64-apple-darwin12.5.0/4.8.1/:/usr/gcc/v4.8.1/libexec/gcc/x86_64-apple-darwin12.5.0/:/usr/gcc/v4.8.1/lib/gcc/x86_64-apple-darwin12.5.0/4.8.1/:/usr/gcc/v4.8.1/lib/gcc/x86_64-apple-darwin12.5.0/
LIBRARY_PATH=/usr/gcc/v4.8.1/lib/gcc/x86_64-apple-darwin12.5.0/4.8.1/:/usr/gcc/v4.8.1/lib/gcc/x86_64-apple-darwin12.5.0/4.8.1/../../../:/usr/lib/
COLLECT_GCC_OPTIONS='-mmacosx-version-min=10.9.1' '-v' '-std=c99' '-o' 'zigzag7' '-mtune=core2'
 /usr/gcc/v4.8.1/libexec/gcc/x86_64-apple-darwin12.5.0/4.8.1/collect2 -dynamic -arch x86_64 -macosx_version_min 10.9.1 -weak_reference_mismatches non-weak -o zigzag7 -L/usr/gcc/v4.8.1/lib/gcc/x86_64-apple-darwin12.5.0/4.8.1 -L/usr/gcc/v4.8.1/lib/gcc/x86_64-apple-darwin12.5.0/4.8.1/../../.. /var/folders/lj/kt7909lm8xj2tl001s6z265r0000gq/T//ccVxV9gX.o -no_compact_unwind -lSystem -lgcc_ext.10.5 -lgcc -lSystem -v
collect2 version 4.8.1
/usr/bin/ld -dynamic -arch x86_64 -macosx_version_min 10.9.1 -weak_reference_mismatches non-weak -o zigzag7 -L/usr/gcc/v4.8.1/lib/gcc/x86_64-apple-darwin12.5.0/4.8.1 -L/usr/gcc/v4.8.1/lib/gcc/x86_64-apple-darwin12.5.0/4.8.1/../../.. /var/folders/lj/kt7909lm8xj2tl001s6z265r0000gq/T//ccVxV9gX.o -no_compact_unwind -lSystem -lgcc_ext.10.5 -lgcc -lSystem -v
@(#)PROGRAM:ld  PROJECT:ld64-224.1
configured to support archs: armv6 armv7 armv7s arm64 i386 x86_64 armv6m armv7m armv7em
Library search paths:
    /usr/gcc/v4.8.1/lib/gcc/x86_64-apple-darwin12.5.0/4.8.1
    /usr/gcc/v4.8.1/lib
    /usr/lib
    /usr/local/lib
Framework search paths:
    /Library/Frameworks/
    /System/Library/Frameworks/
$

Note that it includes a number of system libraries, and a whole bunch of other options and controls. While the details are almost certainly going to be different for your setup, the issues will be analogous; the gcc command adds a lot of options to the invocation of ld compared with your naïve invocation.

Though the other answers are great and I have accepted one, I want to post my own findings.

The segmentation fault occurs because the ELF file is incomplete.

Using ld in that way produces an incomplete ELF file. To see this, we can compare it with the file produced by gcc. The former lacks many sections.

After main finishes, there is still code that needs to run. Unfortunately, my ELF file does not contain that code, so execution continues somewhere invalid and the program crashes. When I debug it with gdb, I find that the error occurs after main.

I can confirm this from another angle. Let's define our own exit function, so that main never returns and nothing else is executed afterwards:

void exit()
{
    asm("movl $42, %ebx \n\t"   /* exit status */
    "movl $1, %eax \n\t"        /* sys_exit in the 32-bit int $0x80 ABI */
    "int $0x80 \n\t"            /* trap into the kernel */
    );
}

Then call exit at the end of main. The program now runs without crashing.
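The exit function above uses the 32-bit int $0x80 interface. As a rough sketch of the same idea with the native x86-64 Linux convention (exit is system call number 60 there, with the status in %rdi), one could write something like the following. The names raw_exit and exit_status_via_raw_exit are hypothetical. The second function exists only to observe the result and does rely on libc for fork and waitpid; on other platforms the sketch simply falls back to _exit.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#if defined(__linux__) && defined(__x86_64__)
/* Hypothetical helper: terminate the process through the native exit
 * system call, without going through libc's exit(). */
static void raw_exit(int code)
{
    __asm__ volatile ("syscall"
                      :
                      : "a"(60L), "D"((long)code)   /* rax = 60, rdi = status */
                      : "rcx", "r11", "memory");
    __builtin_unreachable();                        /* the call never returns */
}
#else
/* Assumption: on any other platform, fall back to libc's _exit(). */
static void raw_exit(int code)
{
    _exit(code);
}
#endif

/* Run raw_exit(code) in a child process and return the exit status the
 * parent observes, or -1 on failure. */
int exit_status_via_raw_exit(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        raw_exit(code);              /* child: ends here */

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In the program linked with plain ld, only the raw_exit part would be usable, since nothing from libc is linked in; calling it as the last statement of main avoids returning from main at all.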

One way to make the manual ld link work could be:

  1. Manually set dynamic linker:

    -dynamic-linker /lib/ld-linux.so.2
    
  2. Include C runtime

    • crt1 : _start symbol (or crt0, crt2 ...)
    • crti : function prologs for the .init and .fini sections (_init and _fini symbols)
    • crtn : function epilogs for the .init/.fini sections

  3. Use -lc to link the object files against the C library

Final command (on my 32-bit Intel):

ld -o ab \
-dynamic-linker /lib/ld-linux.so.2 \
/usr/lib/i386-linux-gnu/crt1.o \
/usr/lib/i386-linux-gnu/crti.o \
/usr/lib/i386-linux-gnu/crtn.o \
a.o b.o -lc

The -v option for gcc mentioned by Jonathan Leffler, or -### to only print the commands without running them, should give you the paths for the files mentioned above.


More in detail

crtN

This has the _start function (symbol) that initializes the process. It typically sets up the environment for the initial execution of the program (bootstrap), as well as argc, argv, env, the stack / frame pointer etc., if not already done by the loader.

Initializes standard library. Memory handling, I/O etc.

Calls _init which calls global constructor functions.

Finally in the start sequence it calls main with argc and argv.

When main returns, _fini is called, which calls the global destructor functions.
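To make the start sequence above a little more tangible, here is a small sketch using the GCC/Clang constructor and destructor attributes. The function and variable names are made up for illustration, and the exact mechanism that runs these functions (for instance .init_array rather than _init itself) depends on the toolchain.

```c
#include <stdio.h>

int ctor_ran = 0;   /* set by the constructor below */

/* Registered to run during startup, before main is called. */
__attribute__((constructor))
void before_main(void)
{
    ctor_ran = 1;
}

/* Registered to run during normal termination, after main returns. */
__attribute__((destructor))
void after_main(void)
{
    puts("destructor ran after main returned");
}

/* Report whether the startup code invoked the constructor. */
int startup_ran(void)
{
    return ctor_ran;
}
```

In a program built with plain gcc, startup_ran() returns 1 when called from main, because the C runtime ran before_main first. If the same objects are linked with ld -e main and no startup files, nothing would be expected to call before_main or after_main.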


Minimalism

In this concrete example we can also make do with a minimal crt (ref. OS Dev):

For i386:

.section .text

.global _start
_start:
        movl $0, %ebp
        pushl %ebp
        pushl %ebp
        movl %esp, %ebp
        call main
        pushl %eax              # i386 passes arguments on the stack
        call exit

or even:

.section .text

.global _start
_start:
        call main
        call exit

For x86-64:

.section .text

.global _start
_start:
        movq $0, %rbp
        pushq %rbp
        pushq %rbp
        movq %rsp, %rbp
        call main
        movl %eax, %edi
        call exit

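Build and run (the dynamic linker path shown is the 32-bit one; on x86-64 it is typically /lib64/ld-linux-x86-64.so.2):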

as crt_min.S -o crt_min.o
ld -o ab -dynamic-linker /lib/ld-linux.so.2 crt_min.o a.o b.o -lc
./ab

Bare minimum

To get rid of the dynamic linker we can use something like:

.global _start
_start:
        call main
        call exit
        nop
exit:
        movl $1, %eax    # sys_exit
        xorl %ebx, %ebx  # exit code
        int $0x80        # call kernel

Build and run:

as crt_min.S -o crt_min.o
ld -o ab crt_min.o a.o b.o 
./ab
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow