Question

I would like to demonstrate that sometime radix-sort is better than quick-sort. In this example I am using the program below:

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <string.h>
#include <time.h>
#include <math.h>

int cmpfunc (const void * a, const void * b) {
   return ( *(int*)a - *(int*)b );
}

void bin_radix_sort(int *a, const long long size, int digits) {
    assert(digits % 2 == 0);

    long long count[2];
    int *b = malloc(size * sizeof(int));
    int exp = 0;

    while(digits--) {
        // Count elements
        count[0] = count[1] = 0;
        for (int i = 0; i < size; i++)
            count[(a[i] >> exp) & 0x01]++;

        // Cumulative sum
        count[1] += count[0];

        // Build output array
        for (int i = size - 1; i >= 0; i--)
            b[--count[(a[i] >> exp) & 0x01]] = a[i];

        exp++;
        int *p = a; a = b; b = p;
    };

    free(b);
}

struct timespec start;

void tic() {
    timespec_get(&start, TIME_UTC);
}

double toc() {
    struct timespec stop;
    timespec_get(&stop, TIME_UTC);
    return stop.tv_sec - start.tv_sec + (
        stop.tv_nsec - start.tv_nsec
    ) * 1e-9;
}

int main(void)
{
    const long long n = 1024 * 1024 * 50;
    printf("Init memory (%lld MB)...\n", n / 1024 / 1024 * sizeof(int));

    int *data = calloc(n, sizeof(int));

    printf("Sorting n = %lld data elements...\n", n);

    long long O;
    tic();
    O = n * log(n);
    qsort(data, n, sizeof(data[0]), cmpfunc);
    printf("%lld %lf s\n", O, toc());

    int d = 6;
    tic();
    O = d * (n + 2);
    bin_radix_sort(data, n, d);
    printf("%lld %lf s\n", O, toc());
}

It performs as follow:

$ gcc bench.c -lm
$ ./a.out 
Init memory (200 MB)...
Sorting n = 52428800 data elements...
931920169 1.858300 s
314572812 1.541998 s

I know that Quick Sort will perform in O(n log n) while Radix Sort will be in O(d (n + r)) ~= O(6 * n). For n = 52428800, log(n) = 17. I am then expecting Radix Sort to be 3 times faster than Quick Sort...

This is not what I observe.

What am I missing?

Was it helpful?

Solution

  1. qsort is not Quicksort. qsort is whatever the implementor of the standard library decided.

  2. Sorting 50 million identical values is highly unrepresentative. There are qsort implementations that will sort 50 million identical, or 50 million sorted or reverse sorted elements in linear time.

  3. Radix sort of 50 million numbers with six digits is obviously nonsense. Six digits means you expect only 64 different values and therefore would use counting sort. With less than 26 digits it doesn't make sense.

  4. Your radix sort does an awful lot of copying, and worse, uncached copying.

  5. Your radix sort doesn't produce a correctly sorted result if digits is an odd number.

  6. Your qsort call doesn't produce a correctly sorted result if the array contains very large and very small numbers.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top