Question

I am compiling a fairly sophisticated application in two modes: Debug and Release. The main difference, as far as I can see, is -O0 vs -O3 (I can provide the relevant part of the makefile if needed). I am trying to avoid syscalls as much as possible, because I am simulating this application in syscall emulation mode (no OS running underneath). The problem I currently have is that in Release mode the compiled binary makes an extra socket syscall, which I would like to avoid (and which does not happen in Debug mode).

The reason that I think the socket might be created is that I am using pthreads and two of my threads are communicating through a volatile char*. So I'm guessing maybe the compiler is trying to implement it in a fancy way when I set the -O3 flag? But I'm not sure if that is a reasonable assumption.

  1. Is it possible that the socket syscall is being generated because of the -O3 flag? (doesn't make too much sense)
  2. If so, how can I hint to the compiler to avoid generating this syscall?

EDIT: BTW the code is in C and C++

EDIT: The code is statically linked against the following libraries:

libstdc++.a 
libm.a 
libglib-2.0.a 
-static-libgcc 
*special pthreads library*

Also, I found where in the binary the call to socket is happening:

8c716:       db28            blt.n   8c76a <openlog_internal+0xf2>
8c718:       f8d9 1008       ldr.w   r1, [r9, #8]
8c71c:       4620            mov     r0, r4
8c71e:       2200            movs    r2, #0
8c720:       f441 2100       orr.w   r1, r1, #524288 ; 0x80000
8c724:       f001 e97c       blx     8da20 <__socket>
8c728:       4b20            ldr     r3, [pc, #128]  ; (8c7ac <openlog_internal+0x134>)
8c72a:       681b            ldr     r3, [r3, #0]
8c72c:       f8c9 0004       str.w   r0, [r9, #4]
8c730:       b943            cbnz    r3, 8c744 <openlog_internal+0xcc>
8c732:       1c43            adds    r3, r0, #1

EDIT: I found out why this is happening (see my answer below). If anyone has an explanation as to why the compiler behaves like that please share!!!


Solution

Although one can imagine such an optimization, I have never heard of one and I really doubt it exists, because a system call is usually very expensive.

If you are on a *nix system, you can verify this by looking for undefined symbols with nm:

nm -u file1.o file2.o | grep socket

If there is a call to socket anywhere in those object files, the output should list the undefined symbol as

        U socket

As I mentioned, I doubt that any optimization inserts a system call, so I expect no output from the command above.

Update:

On my system (Ubuntu 12.04, gcc 4.6), I found the following note in man gcc:

-O2 Optimize even more. ...
NOTE: In Ubuntu 8.10 and later versions, -D_FORTIFY_SOURCE=2 is set by default, and is activated when -O is set to 2 or higher. This enables additional compile-time and run-time checks for several libc functions. To disable, specify either -U_FORTIFY_SOURCE or -D_FORTIFY_SOURCE=0.

So maybe, through this or a similar mechanism, some extra code is included when the optimization level is set to -O2 or -O3.
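To check whether a particular build is affected, a small helper can report the macros that translation unit actually sees. This is only a sketch: the function names are made up for illustration, `_FORTIFY_SOURCE` is the macro named in the man page note above, and `__OPTIMIZE__` is the macro GCC predefines when optimization is enabled.

```c
#include <stdio.h>

/* Hypothetical helper: value of _FORTIFY_SOURCE for this translation unit,
   or 0 when the macro is not defined at all. */
int fortify_source_level(void)
{
#if defined(_FORTIFY_SOURCE)
    return _FORTIFY_SOURCE + 0;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 when the compiler is optimizing, 0 otherwise.
   GCC predefines __OPTIMIZE__ whenever an optimization level is enabled. */
int is_optimizing(void)
{
#ifdef __OPTIMIZE__
    return 1;
#else
    return 0;
#endif
}

/* Prints both values so that Debug and Release builds can be compared. */
void print_fortify_status(void)
{
    printf("_FORTIFY_SOURCE=%d optimizing=%d\n",
           fortify_source_level(), is_optimizing());
}
```

Calling print_fortify_status() from a Debug build and from a Release build, with and without -U_FORTIFY_SOURCE, should show whether the extra checks can be active in each configuration.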

Other suggestions

After a whole day of debugging, it turns out that arm-cross-gcc compiles strcpy() differently under -O0 than under -O1, -O2, and -O3 when the string you are copying from is a volatile char*. With -O0 it compiles to ordinary user-mode assembly, whereas with -O1, -O2, and -O3 the resulting code pulls in extra syscalls such as socket, connect, and send.

So, after all, my initial hunch was justified:

"The reason that I think the socket might be created is that I am using pthreads and two of my threads are communicating through a volatile char*. So I'm guessing maybe the compiler is trying to implement it in a fancy way when I set the -O3 flag? But I'm not sure if that is a reasonable assumption."

EDIT: Here are some observations to support this claim:

I compiled my code in 4 versions

 1. without strcpy() -O0 => obj.o0
 2. with    strcpy() -O0 => obj_strcpy.o0
 3. without strcpy() -O3 => obj.o3
 4. with    strcpy() -O3 => obj_strcpy.o3

I ran nm -u on all of the above.

Here are the diffs:

$> diff obj.o0 obj_strcpy.o0
$> diff obj.o3 obj_strcpy.o3
>          U __strcpy_chk
$>

This means that when you add strcpy() to your code and compile with -O0, no external symbols are added, whereas when you add strcpy() and compile with -O3, the undefined symbol __strcpy_chk is added to the object file. I will look into the implementation of __strcpy_chk on ARM to figure out where the syscalls are coming from. As of right now, it seems that __strcpy_chk performs buffer overflow checks. Here is the reference:

http://refspecs.linux-foundation.org/LSB_4.0.0/LSB-Core-generic/LSB-Core-generic/libc---strcpy-chk-1.html
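To illustrate what a buffer overflow check of this kind looks like, here is a minimal sketch. It is not the libc implementation: checked_strcpy is a hypothetical name, and it returns NULL on overflow so that the behaviour can be observed, whereas a real checked variant is expected to abort the program through a failure handler instead.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration of a bounds-checked strcpy.
   destlen is the size of the destination buffer in bytes. */
char *checked_strcpy(char *dest, const char *src, size_t destlen)
{
    size_t needed = strlen(src) + 1;  /* bytes to copy, including the NUL */

    if (needed > destlen) {
        /* Overflow detected. This sketch only reports it to the caller;
           a real checked variant would terminate the program here. */
        return NULL;
    }
    memcpy(dest, src, needed);
    return dest;
}
```

The failure path is the interesting part for this question: whatever the real handler does in order to report the overflow is code that a plain strcpy() would never pull into the binary.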

EDIT: So there are two solutions so far. One, proposed by Olaf Dietsche, is to use another compiler flag in addition to -O3. The other is to avoid strcpy() altogether and use something like the following:

for(int i=0;i<64;i++)
{
  cmd[i] = msg[i];
}
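The loop above always copies 64 bytes. A slightly more general sketch of the same idea stops at the terminating NUL and never writes past the destination. The name copy_from_volatile and its parameters are made up for this example, and the 64-byte buffers in the usage note simply mirror the loop above.

```c
#include <stddef.h>

/* Hypothetical helper: copies a NUL-terminated string out of a volatile
   buffer one byte at a time, without calling strcpy().
   Writes at most dst_size bytes (including the NUL) and returns the number
   of characters copied, not counting the NUL. */
size_t copy_from_volatile(char *dst, const volatile char *src, size_t dst_size)
{
    size_t i = 0;

    if (dst_size == 0)
        return 0;               /* no room even for the terminator */

    while (i + 1 < dst_size) {
        char c = src[i];        /* exactly one volatile read per byte */
        if (c == '\0')
            break;
        dst[i] = c;
        i++;
    }
    dst[i] = '\0';
    return i;
}
```

With char cmd[64] and volatile char msg[64], the call would be copy_from_volatile(cmd, msg, sizeof cmd). Because no libc string function is involved, no __strcpy_chk reference should be generated for this copy.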
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow