Question

To get the opcodes, the author here does the following:

[bodo@bakawali testbed8]$ as testshell2.s -o testshell2.o
[bodo@bakawali testbed8]$ ld testshell2.o -o testshell2
[bodo@bakawali testbed8]$ objdump -d testshell2

He then gets three sections (or at least mentions only these three):

  • <_start>

  • <starter>

  • <ender>

I have tried to get the hex opcodes the same way, but I cannot get ld to link correctly. Of course I can produce a .o and a prog file, for example with:

gcc main.o -o prog -g

However, when I run

objdump --prefix-addresses --show-raw-insn -Srl prog

to see the complete code with annotations and symbols, I get many additional sections, for example:

  • .init

  • .plt

  • .text (yes, I know, main is here) [many parts here: _start(), call_gmon_start(), __do_global_dtors_aux(), frame_dummy(), main(), __libc_csu_init(), __libc_csu_fini(), __do_global_ctors_aux()]

  • .fini

I assume these are additions introduced by gcc linking against the runtime libraries. I think I don't need all of these sections to call the opcodes from C code (the author uses only those three sections); my problem is that I don't know exactly which ones I can discard and which are necessary. I want to use it like this:

#include <unistd.h>

char code[] = "\x31\xed\x49\x89\x...x00\x00";

int main(int argc, char **argv)
{
/*creating a function pointer*/
int (*func)();
func = (int (*)()) code;
(int)(*func)();

return 0;
} 

So I have created this:

#include <unistd.h>
/*
 * 
 */
int main() {

    char *shell[2];

    shell[0] = "/bin/sh";
    shell[1] = NULL;
    execve(shell[0], shell, NULL);

    return 0;
}

and I disassembled it as described above. I tried to use the opcodes from main() in .text, which gave me a segmentation fault, and then main() plus _start() from .text, with the same result.

So, which of the above sections should I choose, or how can I generate a "prog" as minimal as the one with three sections?


Solution 2

You should read this article: http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html. It explains in great detail everything you need to create a really tiny program.

Other suggestions

char code[] = "\x31\xed\x49\x89\x...x00\x00";

This will not work.

Reason: The code definitely contains addresses, mainly the address of the function execve() and the address of the string constant "/bin/sh".

The executable using the "code[]" approach will not contain the string constant "/bin/sh" at all, and the address of the function execve() will be different (if the function is linked into the executable at all).

Therefore the "call" instruction that was meant to reach execve() will jump to some arbitrary location in the executable that uses the "code[]" approach.
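For contrast, here is a minimal sketch of a byte sequence that can be called from C because it contains no addresses at all. It assumes an x86 or x86-64 Linux system; the bytes encode "mov eax, 42; ret", and the helper name run_machine_code is made up for this example. It copies the bytes into a page that is then marked executable, because on current systems a plain "char code[]" typically lives in a non-executable data page, so jumping into it can segfault regardless of what the bytes are.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* x86 / x86-64 machine code for:  mov eax, 42 ; ret
 * It refers to no external address, so it works wherever it is copied. */
static const unsigned char code[] = { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };

/* Hypothetical helper (not from the original post): copy `len` bytes of
 * machine code into a fresh page, make the page executable, call it and
 * return its result. Returns -1 if the page cannot be set up or if the
 * program was not built for x86 / x86-64. */
static int run_machine_code(const unsigned char *bytes, size_t len)
{
#if defined(__x86_64__) || defined(__i386__)
    const size_t page_size = 4096;
    assert(len <= page_size);

    /* Writable first, so the bytes can be copied in. */
    void *page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;
    memcpy(page, bytes, len);

    /* Then executable (and no longer writable). */
    if (mprotect(page, page_size, PROT_READ | PROT_EXEC) != 0) {
        munmap(page, page_size);
        return -1;
    }

    int (*func)(void) = (int (*)(void))page;
    int result = func();

    munmap(page, page_size);
    return result;
#else
    (void)bytes;
    (void)len;
    return -1;
#endif
}
```

On an x86 machine that allows the mapping, run_machine_code(code, sizeof code) returns 42. Code that calls execve() through libc cannot be moved around like this, because its call target only makes sense inside the executable it was linked into.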

Some theory about executables - just for your information:

There are two possibilities for executables:

  • Statically linked: These executables contain all necessary code. Therefore they do not access dynamic libraries like "libc.so"
  • Dynamically linked: These executables do not contain code that is frequently used. Such code is stored in files common to all executables: The dynamic libraries (e.g. "libc.so")

For the same C code, a statically linked executable is much bigger than a dynamically linked one, because all C library functions it uses (e.g. "printf", "execve", ...) must be bundled into the executable.

When not using any of these library functions the statically linked executables are simpler and therefore easier to understand.

Statically linked executable behaviour

A statically linked executable is loaded into memory by the operating system (when it is started using execve()). The executable contains an entry point address. This address is stored in the file header of the executable. You can see it as the "start address" using "objdump -f ...".

The operating system performs a jump to that address, so program execution starts there. The address is typically that of the function "_start"; however, this can be changed using command line options when linking with "ld".
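As a rough illustration of where that address lives, the following sketch reads the entry point directly from the ELF file header. It assumes Linux with <elf.h> available and a file whose byte order matches the host; read_elf_entry is a name invented for this example.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read the entry point address from the ELF header
 * of the file at `path`. Returns 0 on success, -1 on any error.
 * Assumption: the file uses the same byte order as the host. */
static int read_elf_entry(const char *path, uint64_t *entry)
{
    unsigned char buf[sizeof(Elf64_Ehdr)];

    assert(path != NULL && entry != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t n = fread(buf, 1, sizeof buf, f);
    fclose(f);

    /* Every ELF file starts with the magic bytes 0x7f 'E' 'L' 'F'. */
    if (n < sizeof(Elf32_Ehdr) || memcmp(buf, ELFMAG, SELFMAG) != 0)
        return -1;

    if (buf[EI_CLASS] == ELFCLASS64 && n >= sizeof(Elf64_Ehdr)) {
        Elf64_Ehdr h;
        memcpy(&h, buf, sizeof h);
        *entry = h.e_entry;
        return 0;
    }
    if (buf[EI_CLASS] == ELFCLASS32) {
        Elf32_Ehdr h;
        memcpy(&h, buf, sizeof h);
        *entry = h.e_entry;
        return 0;
    }
    return -1;
}
```

Calling read_elf_entry("testshell2", &addr) on the linked file from the question should yield the address of its "_start" label, since the loader only looks at this header field and not at the symbol name.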

The code at "_start" prepares the executable (e.g. initializes variables, calculates the values for "argc" and "argv", ...) and calls the "main()" function. When "main()" returns, the startup code passes the value returned by "main()" to the "exit()" function.

Dynamically linked executable behaviour

Such executables contain two additional sections. The first section contains the file name of the dynamic linker (e.g. "/lib/ld-linux.so.1"). The operating system will then load both the executable and the dynamic linker and jump to the entry point of the dynamic linker (and not to that of the executable).

The dynamic linker will read the second additional section: It contains information about dynamic libraries (e.g. "libc.so") required by the executable. It will load all these libraries and initialize a lot of variables. Then it calls the initialization function ("_init()") of all libraries and of the executable.

Note that both the operating system and the dynamic linker ignore the function and section names! The address of the entry point is taken from the file header and the addresses of the "_init()" functions are taken from the additional section, so the functions may be named differently!

When all this is done the dynamic linker will jump to the entry point ("_start") of the executable.
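The result of this loading process can be observed from inside a running program. The sketch below assumes Linux with glibc, whose dl_iterate_phdr() function walks over every object the dynamic linker has loaded; the names print_object and list_loaded_objects are invented for the example. For each object it also prints the requested dynamic linker if the object carries a PT_INTERP program header.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <link.h>
#include <stdio.h>

/* Callback for dl_iterate_phdr(): called once per loaded object
 * (the executable itself and each shared library). */
static int print_object(struct dl_phdr_info *info, size_t size, void *data)
{
    int *count = data;
    (void)size;

    /* glibc reports the main program with an empty name. */
    const char *name = (info->dlpi_name != NULL && info->dlpi_name[0] != '\0')
                           ? info->dlpi_name
                           : "(main program)";
    printf("%s loaded at base 0x%llx\n", name,
           (unsigned long long)info->dlpi_addr);

    /* PT_INTERP holds the path of the dynamic linker the object asks for. */
    for (int i = 0; i < info->dlpi_phnum; i++) {
        if (info->dlpi_phdr[i].p_type == PT_INTERP) {
            const char *interp =
                (const char *)(info->dlpi_addr + info->dlpi_phdr[i].p_vaddr);
            printf("  requested dynamic linker: %s\n", interp);
        }
    }

    (*count)++;
    return 0; /* 0 means: continue with the next object */
}

/* Print all loaded objects and return how many there are. */
static int list_loaded_objects(void)
{
    int count = 0;
    dl_iterate_phdr(print_object, &count);
    return count;
}
```

In a dynamically linked program, list_loaded_objects() typically reports the program itself, the C library and the dynamic linker; a statically linked program usually shows only itself.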

About the "GOT", "PLT", ... sections:

These sections contain information about the addresses at which the dynamic libraries have been loaded by the dynamic linker. The "PLT" section contains wrapper code consisting of jumps into the dynamic libraries. This means: the "PLT" section contains a "printf()" stub that does nothing but jump to the real "printf()" function in "libc.so". This is done because calling a function in a dynamic library directly from C code would make linking much more difficult, so C code does not call such functions directly. Another advantage of this implementation is that "lazy linking" becomes possible: an address is only resolved when the function is called for the first time.
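The idea of a stub that jumps through a patchable slot can be imitated in plain C. The following is only a toy model under that assumption, not how the real PLT and GOT are encoded, and every name in it is invented for the illustration.

```c
#include <assert.h>

typedef int (*fn_t)(int);

/* Stand-in for a function that lives in a shared library. */
static int library_square(int x)
{
    return x * x;
}

/* Counts how often the simulated "dynamic linker" had to run. */
static int resolver_calls = 0;

static int resolve_and_call(int x);

/* Simulated GOT slot: it initially points at the resolver,
 * not at the real function. */
static fn_t got_square = resolve_and_call;

/* Simulated lazy resolver: find the real function, patch the slot,
 * then forward the call that triggered the lookup. */
static int resolve_and_call(int x)
{
    resolver_calls++;
    got_square = library_square; /* later calls skip the resolver */
    return got_square(x);
}

/* Simulated PLT stub: callers always call this; it only jumps
 * through the slot. */
static int plt_square(int x)
{
    return got_square(x);
}
```

The first call to plt_square() goes through the resolver once; every later call reaches library_square() directly through the patched slot, which is the essence of lazy linking.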

Some words about Windows

Windows only knows dynamically linked executables. Windows XP even refused to load an executable not requiring DLLs. The "dynamic linker" is integrated into the operating system and not a separate file. There is also an equivalent of the "PLT" section. However many compilers support "directly" calling DLL code from C code without calling the code in the PLT section first (theoretically this would also be possible under Linux). Lazy linking is not supported.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow