Question

Here is the pseudocode for LSD radix sort from http://www.cs.princeton.edu/~rs/AlgsDS07/18RadixSort.pdf

   public static void lsd(String[] a)
   {
        int N = a.length;
        int W = a[0].length;
        for (int d = W-1; d >=0; d--)
        {
             int[] count = new int[R];
             for(int i = 0; i < N; i++)
                   count[a[i].charAt(d) + 1]++;
             for(int k = 1; k < 256; k++)
                   count[k] += count[k-1];
             for(int i = 0; i < N; i++)
                   temp[count[a[i].charAt(d)]++] = a[i];
             for(int i = 0; i < N; i++)
                   a[i] = temp[i];
        }
   }

I guess that by 256 they mean $R$. We have a simple for loop, so the time is $\Theta(W(R+N))$ The reason why I use $\Theta$ is because the bound is tight, this is how many operations you will be doing no matter what, so it's worst case and best case performance at the same time. The space requirements are $\Theta(N+R)$

The following is the pseudocode of MSD:

    public static void msd(String[] a)
    { 
        msd(a, 0, a.length, 0); 
    }

    private static void msd(String[] a, int lo, int hi, int d)
    {
        if (hi <= lo+1) return;
        int[] count = new int[256+1];
        for (int i = 0; i < N; i++)
             count[a[i].charAt(d)+1]++;
        for (int k = 1; k < 256; k++)
             count[k] += count[k-1];
        for (int i = 0; i < N; i++)
             temp[count[a[i].charAt(d)]++] = a[i];
        for (int i = 0; i < N; i++)
             a[i] = temp[i];
        for (int i = 0; i < 255; i++)
             msd(a, l+count[i], l+count[i+1], d+1);
   }

Since the recursion we will stop recursing after $W$ levels in the worst case, the space complexity becomes $\Theta(N + WR)$

however what about time? Just from space itself, it seems like MSD is actually pretty bad. What can you say about the time? To me it looks like you will have to spend $R$ operations for every node in your recursion tree and also $N$ operations per every level. So the time is $\Theta(NW + R*amountOfNodes)$.

I do not see how and why MSD would ever be preferred in practice given these time bounds. They are pretty horrible, but I am still not sure whether I get the time bounds in a correct way, for instance I have no clue how big is amountOfNodes..

Keep in mind that I would like the analysis to be in terms of R and N. I know that in practice R is usually small so it's just a constant, but I am not interested in this scenario, since I would like to see how much does R affect the performance as well.

I asked a kind of similar question yesterday but I did receive any replies, I think the reason for that was because my question was not well structured, so I could remove this question if the admins wanted. Thank you in advance!

No correct solution

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top