Practical Applications of Radix Sort

https://cs.stackexchange.com/questions/12223

16-10-2019
|

Pergunta

Radix sort is theoretically very fast when you know that the keys are in a certain limited range, say $n$ values in the range $[0\dots n^k -1]$ for example. If $k<\lg n$ you just convert the values to base $n$ which takes $\Theta(n)$ time, do a base $n$ radix sort and then convert back to your original base for an overall $\Theta(nk)$ algorithm.

However, I've read that in practice radix sort is typically much slower than doing for example a randomized quicksort:

For large arrays, radix sort has the lowest instruction count, but because of its relatively poor cache performance, its overall performance is worse than the memory optimized versions of mergesort and quicksort.

Is radix sort just a nice theoretical algorithm, or does it have common practical uses?

Solução

Radix sorts are often, in practice, the fastest and most useful sorts on parallel machines.

On each node of the multiprocessor you probably do something like a quicksort, but radix sort allows multiple nodes to work together with less synchronization than the various recursive sorts.

There are other situations too. If you need a stable sort (a sort where whenever two keys are equal they stay in the same order rather than getting rearranged) then I'm not aware of any version of quicksort that will be of use. Mergesort is also stable (if implemented correctly). Your link is the first time I've ever heard anyone say that mergesort could be made to have better cache behavior than radix sort.

Outras dicas

@Robert: Your link is quite surprising (actually I could not find the quoted sentence). My personal experience is for random input, radix sort is much faster than the STL std::sort(), which uses a variant of quicksort. I used to make an algorithm 50% faster by replacing std::sort() with an unstable radix sort. I am not sure what is the "memory optimized version" of quicksort, but I doubt it can be twice as fast as the STL version.

This blog post evaluated radix sort along with several other sorting algorithms. Briefly, in this evaluation, std::sort() takes 5.1 sec to sort 50 million integers, while in-place/unstable radix sort takes 2.0 sec. Stable radix sort should be even faster.

Radix sort is also widely used for stably sorting strings. Variants of radix sort are at times seen for constructing suffix arrays, BWT, etc.

Radix sort is also a natural way to sort fixed-length words over a fixed alphabet, e.g. in Kärkkäinen & Sanders algorithm (http://www.cs.cmu.edu/~guyb/realworld/papersS04/KaSa03.pdf)

Licenciado em: CC-BY-SA com atribuição

Não afiliado a cs.stackexchange