Question

Given the program below, segfault() will (As the name suggests) segfault the program by accessing 256k below the stack. nofault() however, gradually pushes below the stack all the way to 1m below, but never segfaults.

Additionally, running segfault() after nofault() doesn't result in an error either.

If I put sleep()s in nofault() and use the time to cat /proc/$pid/maps I see the allocated stack space grows between the first and second call, this explains why segfault() doesn't crash afterwards - there's plenty of memory.

But the disassembly shows there's no change to %rsp. This makes sense since that would screw up the call stack.

I presumed that the maximum stack size would be baked into the binary at compile time (In retrospect that would be very hard for a compiler to do) or that it would just periodically check %rsp and add a buffer after that.

How does the kernel know when to increase the stack memory?

#include <stdio.h>
#include <unistd.h>

void segfault(){
  char * x;
  int a;
  for( x = (char *)&x-1024*256; x<(char *)(&x+1); x++){
    a = *x & 0xFF;
    printf("%p = 0x%02x\n",x,a);
  }
}

void nofault(){
  char * x;
  int a;
  sleep(20);
  for( x = (char *)(&x); x>(char *)&x-1024*1024; x--){
    a = *x & 0xFF;
    printf("%p = 0x%02x\n",x,a);
  }
  sleep(20);
}

int main(){
  nofault();
  segfault();
}
Was it helpful?

Solution

The processor raises a page fault when you access an unmapped page. The kernel's page fault handler checks whether the address is reasonably close to the process's %rsp and if so, it allocates some memory and resumes the process. If you are too far below %rsp, the kernel passes the fault along to the process as a signal.

I tried to find the precise definition of what addresses are close enough to %rsp to trigger stack growth, and came up with this from linux/arch/x86/mm.c:

/*
 * Accessing the stack below %sp is always a bug.
 * The large cushion allows instructions like enter
 * and pusha to work. ("enter $65535, $31" pushes
 * 32 pointers and then decrements %sp by 65535.)
 */
if (unlikely(address + 65536 + 32 * sizeof(unsigned long) < regs->sp)) {
        bad_area(regs, error_code, address);
        return;
}

But experimenting with your program I found that 65536+32*sizeof(unsigned long) isn't the actual cutoff point between segfault and no segfault. It seems to be about twice that value. So I'll just stick with the vague "reasonably close" as my official answer.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top