Question

My main looks like this:

int main(int argv,char **argc) 
{
     extern system,puts; 
     void (*fn)(char*)=(void(*)(char*))&system;
     char buf[256];

     fn=(void(*)(char*))&puts;
     strcpy(buf,argc[1]);
     fn(argc[2]);
     exit(1);
}

I have a general idea of what the extern keyword means, and how it works in this code. I'm confused with the declaration of the fn function. Is a function pointer declared here, or a function? Also, why is &system and &puts at the end of the definition of fn.

Was it helpful?

Solution

This code is a highly contrived example of how a buffer overflow can be used to call a function that the code was not originally designed to call. It should not be taken as an example of well written code, quite the opposite in fact.

The following line declares a pointer to a function and initializes it to point to the system function.

 void (*fn)(char*)=(void(*)(char*))&system;

Note that as written, the code never actually calls the system function, because the following line changes the pointer to point to the fputs function.

 fn=(void(*)(char*))&puts;

The vulnerability in the program is in this line

strcpy(buf,argc[1]);

If the strlen of argc[1] is greater than the buffer size, then it's possible for the buffer overrun to change the value fn to point to some arbitrary function, which will then be called by this line

fn(argc[2]);

Side note: as someone pointed out in the comments, the names of argc and argv should be switched.

OTHER TIPS

First let me say that this is horrible way of coding.

That said, let's see how it works:

extern system, puts;

This says that system and puts are defined somewhere else, and the linker would take care of giving you their address. The compiler doesn't know what type they have, so, gcc at least assumes it's an int and issues a warning. I can't find the clause in the standard that says whether this is well-defined, implementation-defined, or undefined behavior, but for sure it's bad. It's especially bad because those symbols are not ints. I did find this though (C11, Annex J.2 - Undefined behavior):

Two declarations of the same object or function specify types that are not compatible (6.2.7).

There is also the possibility that int is too small to hold a pointer. There is no end to how much this is bad.

So, let's see in memory how this looks:

system (as 4-byte int)           system (as function)
  BYTE1                             INSTRUCTION1
  BYTE2                             INSTRUCTION2
  BYTE3                             INSTRUCTION3
  BYTE4                             INSTRUCTION4
                                    INSTRUCTION5
                                    INSTRUCTION6
                                    INSTRUCTION7
                                    INSTRUCTION8
                                    ...

So what the linker does is, to make the address of system the same on each compiled object (effectively linking them together). The contents of the system on the other hand depends on its type which the linker doesn't know or care about.

So in your file, if right after:

extern system, puts;

you write:

printf("%d\n", system);

you will get the first couple instructions of the system function as if it were an int and print it. Which is bad, and don't do it.

On the other hand, as I said, the address of system as defined as extern in your program is the same as the real system function. So if you take the address:

&system

and cast it to its correct type:

(void(*)(char*))&system

you get a function pointer, pointing to the real system function. Note that system was just now an int, and we are about to call it as a function, this is by itself undefined behavior:

C11 Annex J-2 (undefined behavior):

An object has its stored value accessed other than by an lvalue of an allowable type (6.5).

C11 6.5 (emphasis mine):

6. The effective type of an object for an access to its stored value is the declared type of the object, if any. ...

7. An object shall have its stored value accessed only by an lvalue expression that has one of the following types: 88)

  • a type compatible with the effective type of the object,
  • a qualified version of a type compatible with the effective type of the object,
  • a type that is the signed or unsigned type corresponding to the effective type of the object,
  • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.

Getting passed all of this, you now have a function pointer containing the real address of system. You then call it to your heart's content. The same is done with puts.


That was bad. This is how this is actually supposed to be done:

#include <stdlib.h>
#include <stdio.h>

int main(int argv,char **argc) 
{
     char buf[256];

     /* system was unused */

     strcpy(buf,argc[1]);
     puts(argc[2]);

     return 0; /* 0 means no error */
}

Besides the goodness of using the implementation-provided header files, let's see how this one instead actually works.

Inside the header files, you have many lines with many declarations. The one for puts could look like this:

int puts(const char *s);

First, note how the function pointer in your code actually got the function signature wrong? That is bad. It's bad because calling a void function, the caller doesn't need to provide a return value space for it, while puts expects one to be there, so puts will write the return value where it expects to find it, but in reality that space belongs to something else. This is called stack corruption.

Anyway, when the compiler sees a line like this:

int puts(const char *s);
  1. it will know that puts is an external definition:

    C11 6.2.2 (emphasis mine):

    5. If the declaration of an identifier for a function has no storage-class specifier, its linkage is determined exactly as if it were declared with the storage-class specifier extern. If the declaration of an identifier for an object has file scope and no storage-class specifier, its linkage is external.

  2. it will know it's a function, and what its signature is.

So just by calling fputs(...) the compiler does everything correctly.

The declaration for fn is not declaring a function, it's declaring a function pointer, and initializing it. This line:

void (*fn)(char*)=(void(*)(char*))&system;

Is equivalent to:

void (*fn)(char*);
fn = (void(*)(char*)) &system;

Which declares fn to be a pointer to a function receiving a pointer to char (and returning nothing). Then, this pointer is assigned the address of system() casted to the appropriate type to match fn's type.

extern system, puts;
// equivalent to
extern int system, puts;

The above statement declares two variable system and puts of type int which is the default type when it is not specified (gcc emits a warning in this regard). Declaration means it introduces the variable name and type but does not allocate space for it.

Is a function pointer declared here, or a function?

void (*fn)(char*); 

defines fn to be a pointer to a function which takes a char * and returns no value. The expression

(void(*)(char*))&system

takes the address of the variable system and casts it to a pointer of type void(*)(char *), i.e., a pointer to a function which takes a char * and returns no value. Thus the statement

void (*fn)(char*)=(void(*)(char*))&system;

assigns the address of system after casting it to proper type to the pointer fn.

why is &system and &puts at the end of the definition of fn

That's because fn is a function pointer. During the linking phase of compilation, the identifiers system and puts are linked against the library functions. This means the address of the variables must be cast to appropriate type before assigning to the function pointer fn.

However, it is not the correct way to do it. The function prototypes must be included by using the header files they are contained in.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top