Question

You have a large list (say N > 10000) of test scores which you would like to sort. The test scores are between 1 and 100. What is the quickest way to sort the list?

First thought: we have an O(N log N) bound, but we also have additional information about the quantities in the array, so I think we could do better.

Second thought: should we use hash tables? Do we care about duplicates? I can't see how to use hash tables.

Third thought: does this have something to do with radix sorting? No idea.

Fourth thought: could we sort this list by creating another list, and then pass through the original, counting the frequencies of the elements that occurred? But then we would need another pass to create a larger sorted list, which would be O(N^2), i.e. too large.

Is this a very easy question or a very hard question?

Solution

This is a very easy question, assuming all scores are integers.

Here is the simplest algorithm in plain words. We initialize count, an integer array of zeros with one entry per possible score (with 0-based indexing, 101 entries so that count[100] exists; count[0] stays unused). For each score s, we add 1 to count[s]. To produce the wanted sorted scores, we output count[1] 1s, count[2] 2s, ..., and finally count[100] 100s.

This kind of sorting algorithm is called counting sort.

The case of $N > 10000$ test scores that are between 1 and 100 is a prime use of counting sort. The time complexity is $O(N)$, and the extra space for the counts is bounded by some small constant multiple of 100.
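
The algorithm described above can be sketched in Python as follows. This is a minimal sketch, assuming all scores are integers; the function name `counting_sort_scores` and the `lo`/`hi` parameters are illustrative choices, not part of the original answer.

```python
def counting_sort_scores(scores, lo=1, hi=100):
    """Sort integer scores known to lie in [lo, hi] using counting sort."""
    counts = [0] * (hi - lo + 1)          # one counter per possible score
    for s in scores:                      # first pass over the input: O(N)
        if not lo <= s <= hi:
            raise ValueError(f"score {s} is outside [{lo}, {hi}]")
        counts[s - lo] += 1

    result = []
    for offset, c in enumerate(counts):   # second pass: O(N + range size)
        result.extend([lo + offset] * c)  # emit the score c times
    return result


print(counting_sort_scores([87, 1, 100, 87, 42]))  # [1, 42, 87, 87, 100]
```

Shifting each score by `lo` avoids the unused slot at index 0, so the count array has exactly one entry per possible score.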

You may want to check counting sort for more information.

OTHER TIPS

Yes, sorting algorithms like merge sort achieve O(N log N), but we can do better here. The additional information given about the bound on the test scores is very useful.

Do we care about duplicates?

If we are dealing with just the scores, do not care about other information such as student_name or subject_info, and only want the scores in sorted order, then you can use this algorithm.

    int table[101] = {0}   // direct-address table indexed by score;
                           // all elements are initialised to 0

    // array contains all scores
    for score in array
        table[score] = table[score] + 1
    // the loop above takes O(N) time and O(1) extra space

    sorted_list = {} // initially empty
    for (score = 0; score < 101; score++)
        for (count = 0; count < table[score]; count++)
            sorted_list.add(score)
    // the loops above take O(N) time and O(N) space

Now, if we care about the information attached to a score, such as the student or subject it belongs to, use the approach below. I'm assuming you will store the score and its related info in a C/C++ structure or some other object format.

Maintain a table of size 100 (the range of test scores) where key = score and value = a list of the objects or instances with this score (if you are sorting a list of students, then the list of students with this score).

If you are familiar with C/C++, this data structure can be implemented as an array of linked lists. The technique used here is the same as separate chaining in hash tables.

The data structure works like this: DS[score] holds a pointer/reference to the linked list for that score. By using another table to keep track of the tail of each sub-list in DS, we can insert a new element in O(1) time.

So in a single pass over the array (from i = 0 to i < n) we insert every object into DS. After inserting, we can create the new sorted list with a single pass over the DS we have built.

The algorithm is like this.

Let array contain all the objects with their respective scores.

    for (i = 0; i< n; i++)
      key = array[i].score
      DS[key].insert(array[i]) //the tail part can be used for O(1) insertion.

     //the above loop runs in O(N)

     sorted_list = {} // empty_list
     for(score = 1; score<=100;score++)
       for each (obj in DS[score]) 
          sorted_list.add(obj)

      //the above loop runs in O(N).

     //the N refers to the size of original list here.

This approach is, in effect, a queue-based radix sort in base 100 with a single digit pass. Read more about radix sort and about counting sort with a queue implementation.
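
The bucket approach above could look like this in Python. It is a sketch under a few assumptions: the `Record` type, its `name` field, and the function name `sort_records_by_score` are hypothetical, and a Python list per score stands in for the linked list with a tail pointer (appending to a list is amortized O(1)).

```python
from collections import namedtuple

# Hypothetical record type; any object with a .score attribute would work.
Record = namedtuple("Record", ["name", "score"])


def sort_records_by_score(records, lo=1, hi=100):
    """Order records by integer score in [lo, hi], keeping ties in input order."""
    buckets = [[] for _ in range(hi - lo + 1)]  # one bucket per possible score
    for rec in records:                         # single pass over the input: O(N)
        buckets[rec.score - lo].append(rec)     # append at the tail of the bucket

    ordered = []
    for bucket in buckets:                      # single pass over the buckets
        ordered.extend(bucket)
    return ordered


students = [Record("Ann", 90), Record("Bob", 72), Record("Cid", 90), Record("Dee", 1)]
print([r.name for r in sort_records_by_score(students)])  # ['Dee', 'Bob', 'Ann', 'Cid']
```

Because each record is appended at the tail of its bucket, records with equal scores keep their original relative order, so the sort is stable.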

from the question : "Fourth thought: Could we sort this list by creating another list, and then pass through the original counting frequencies of elements occurred. But would we need another pass to create a larger sorted list, which would be O(N^2). ie too large. "

I think you are mistaken in assuming that another pass would change N to N^2. Unless the 'another pass' is nested inside a loop over the input, it won't; the two passes simply run one after the other.

I hope I answered all your questions.
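
As a rough count for the score-only algorithm, assuming one step per loop iteration: the counting pass takes $N$ steps, the outer output loop runs 101 times, and the inner loop runs $\sum_{s} \text{table}[s] = N$ times in total, so

$$T(N) = N + 101 + N = 2N + 101 = O(N).$$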

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange