Question

I played around with C and different sorting algorithms. Instead of generating a random array to sort, I decided first to try and declare an automatic array of some size (say, 100) without initializing it. I expected that the operating system (I'm on Windows XP) will put zeros in there, otherwise that could be some data left by some other process. But it turned out, there are no zeros in such uninitialized array. So the first question is: what data could it be and why isn't it cleaned by the OS? Could it be the garbage of the "cmd" shell I am running the compiler and the program in?

Next, since the sorting algorithm I tried wasn't allocating any more memory and was just sorting all in place, I thought there would be two possibilities: either (a) that memory area which was allocated for the array will become sorted and stay that way on the next run, or (b) on the next run the OS will give the program another part of the memory and therefore another garbage. But it turned out, it is neither. The memory stays the same no matter how many times I sort it. So, the second question is: how is that possible that I get some random virtual memory area, sort it, and on the next run I get the same area in its original form?

Below are the code and my steps.

#include <stdio.h>
#define SIZE 100

void quick_sort( int *a, size_t size )
{
     /* No comments here, everybody knows quicksort
     and anyone will write it better! :) */

     if( size <= 1 ) return;

     int tmp, pivot;
     int small_length, pivot_pos;

     pivot_pos = size / 2;
     pivot = a[pivot_pos];
     tmp = a[0];
     a[0] = pivot;
     a[pivot_pos] = tmp;

     small_length = 0;

     for( int i = 1; i < size; i++ )
     {
      if( a[i] < pivot ){
           small_length++;
           tmp = a[small_length];
           a[small_length] = a[i];
           a[i] = tmp;
      }
     }

     a[0] = a[small_length];
     a[small_length] = pivot;

     quick_sort( a, small_length );
     quick_sort( a + small_length + 1, size - small_length - 1 );
}

int main( int argc, char *argv[] )
{
     int a[SIZE];
     int i;

     quick_sort( a, SIZE );
     for( i = 0; i < SIZE; i++ ) {
        printf( "%d\n", a[i] );
     }
     return 0;
}

Steps I used:

  1. Compile with quick_sort call commented, to see the unsorted unitialized array: cc sort.c -Wall -std=c99 && a > unsorted1 (the compiled executable is a.exe).
  2. Uncomment the quick_sort call and see the sorted result: cc ... > sorted.
  3. Comment the call again to see the unsorted array again: cc ... > unsorted2.
  4. Compare files unsorted1 and unsorted2 - they are the same, all 100 lines.

I tried also starting another instance of "cmd", and the data is the same.

Updates

I. Hex dump of unsorted data:

0000000 0000 0000 fd7c 0022 fd74 0022 fdd8 0022
0000020 0170 0000 1448 7c91 ffff ffff 1440 7c91
0000040 0000 0024 13d2 7c91 301c 0024 3008 0024
0000060 0010 0000 b988 7c97 0000 003e 0007 0000
0000100 fd24 0022 0000 0000 fe24 0022 e900 7c90
0000120 0040 0001 002e 0000 0000 003e eeeb 7c80
0000140 fe30 0022 e900 7c90 0040 7c91 ffff ffff
0000160 003d 0001 0002 0000 fd5c 0022 0000 0000
0000200 fe50 0022 0002 0000 0002 0000 ffff ffff
0000220 003d 7c91 c2de 77c1 0000 003e 0000 0000
0000240 c2e3 77c1 a122 0040 0000 0000 2bc4 003e
0000260 2070 77c0 ffff ffff 0000 003e 2bc0 003e
0000300 0004 0000 2373 0024 1660 0040 fea0 0022
0000320 c024 77c1 0000 003e 0000 0000 c02d 77c1
0000340 1660 0040 2373 0024 1dc0 0040 fec0 0022
0000360 c024 77c1 0000 003e 0000 0000 c02d 77c1
0000400 1dc0 0040 2373 0024 0000 0000 5c94 77c2
0000420 a52e 77c2 1ae8 77c5 fedc 0022 9d60 77c2
0000440 fe88 0022 4e2f 77c2 feec 0022 5c94 77c2
0000460 a52e 77c2 1ae8 77c5 fefc 0022 9d60 77c2
0000500 0008 0000 4e2f 77c2 4e29 77c2 ff3c 0022
0000520 2373 0024 0000 0000 1dc0 0040 fed4 0022
0000540 ff08 0022 ffe0 0022 5c94 77c2 2850 77c0
0000560 ffff ffff 4e29 77c2 4e42 77c2 1dc0 0040
0000600 ffa0 0022 1e1e 0040 1dc0 0040 0020 0000

To produce the dump I had to add a FILE, fopen(...), fclose(...) to the code, and that changed the output. But these calls go after the array declaration, so still not sure how that affects the contents of the array.

II. Tried to increase the SIZE to 10000 and saw the zeros finally. The output was many zeros, followed by that familiar garbage right at the end. I'm not sure at what direction the memory grows, I expected that data to stay at the beginning of the array...

Was it helpful?

Solution

Modern, multi-user/multi-process operating systems always hand over clean memory pages to processes that request memory (so that sensitive data doesn't leak from one process to another). Note that this "clean" contract applies only when the operating system initially hands the memory over to the process. If the process uses the memory for one thing initially and then for something else later, the memory does not get automatically re-cleaned. This is what is happening here - your uninitialized variable is on the stack, and the stack has already been used by the C runtime startup code before main runs. You're seeing whatever junk the startup code left on the stack. If you're curious exactly what it is, you can read the Microsoft Visual C Runtime Library source code (it ships with Visual Studio) or look at the disassembly of your program's entry point in a disassembler.

The same applies to the heap. When malloc, VirtualAlloc, etc. first get pages from the operating system, they will be clean. But when you call free, VirtualFree, etc. the runtime library is not obligated to give the pages back to the operating system and re-request them. It can just hold on to them (without cleaning them) and use them to satisfy a later request.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top