Question

Below is an LSD Radix sort implementation in Java from a textbook to sort an array of strings where each string contains exactly W characters.

I want to count the number of array accesses during runtime. I've read that LSD sort is supposed to require n * c array accesses where n is the number of strings and c the amount of characters in each string. However, the algorithm below accesses more than one array several times. If I increment a counter at each of these I'll end up with a significant factor of nc.

So what exactly constitutes 'array access' in the context of algorithms? Is there only one instance of array access that is considered more significant that I should count here, or is this example in fact an inefficient implementation that uses more array access than necessary?

 public int lsdSort(String[] array, int W) {
  int access = 0;
  // Sort a[] on leading W characters.
  int N = array.length;
  String[] aux = new String[N];
  for (int d = W-1; d >= 0; d--)
  { // Sort by key-indexed counting on dth char.
    int[] count = new int[R+1]; // Compute frequency counts.
    for (int i = 0; i < N; i++) {
        count[array[i].charAt(d) + 1]++;
    }
    for (int r = 0; r < R; r++) {
        // Transform counts to indices.
        count[r+1] += count[r];
    }
    for (int i = 0; i < N; i++) {
        // Distribute.
        aux[count[array[i].charAt(d)]++] = array[i];

    }  
    for (int i = 0; i < N; i++) // Copy back.
        array[i] = aux[i];
  }

  return access;
  }
Was it helpful?

Solution

I've read that LSD sort is supposed to require n times c array accesses where n is the number of strings and c the amount of characters in each string.

Are you sure you haven't read that it's O(nc)? That's not the same thing at all. It's big-O notation. The point isn't to determine the exact number of array accesses - it's to talk about how it grows (or rather, the limit of how it can grow) as n or c increase. In this case it increases linearly - if you increase n by a factor of 1000, you would only expect the total cost to grow by a factor of 1000 too... whereas if it were an O(n2c) algorithm instead, it might grow by a factor of 1,000,000. (Strictly speaking any O(nc) algorithm is also O(n2c) due to it only being a limit, but let's not get into that.)

OTHER TIPS

All the array accesses inside the first for loop are essentially counted as a combined number of array accesses, so that's your c. The n is how many times you do these combined array accesses. This gives you an approximate idea of the growth of the function, not the actual count of accesses.

public int lsdSort(String[] array, int W) {
  int access = 0;
  // Sort a[] on leading W characters.
  int N = array.length;
  String[] aux = new String[N];
  for (int d = W-1; d >= 0; d--)
  { // Sort by key-indexed counting on dth char.
    int[] count = new int[R+1]; // Compute frequency counts.
    for (int i = 0; i < N; i++) {
        count[array[i].charAt(d) + 1]++;
        access++;
        access++;
    }
    for (int r = 0; r < R; r++) {
        // Transform counts to indices.
        count[r+1] += count[r];
        access++;
    }
    for (int i = 0; i < N; i++) {
        // Distribute.
        aux[count[array[i].charAt(d)]++] = array[i];
        access++; 
        access++;
        access++;   
    }  
    for (int i = 0; i < N; i++) // Copy back.
        array[i] = aux[i];
        access++;
        access++;
  }

  return access;

  }

an array 'access' is either a read or a write...

In Big-O asymptotic notation, the number of access are proportional to a constant. When you analyze the code, all the constants are discarded.

In the case of Radix Sort the Big O is O(cn). But if you want to actually count how many times the array is accessed you need to multiply that number for some constant k, where that k is specific to the concrete coded implementation.

For example, this function is O(n) but the number of times the array is accessed is 2n: one for reading the value, and one for updating it. The number 2 is discarded.

for (i=0; i<N; i++)
   A[i] = A[i] + 1;
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top