Question

From reading this article from Wikipedia on sorting algorithms, it would seem that smoothsort is the best sorting algorithm there is. It has top performance in all categories: best, average, and worst. Nothing beats it in any category. It also has constant memory requirements. The only downside is that it isn't stable.

It beats timsort in memory, and it beats quicksort in both worst-case performance and memory.

But I never hear about smoothsort. Nobody ever mentions it, and most discussions seem to revolve around other sorting algorithms.

Why is that?

Was it helpful?

Solution

Big-O performance is great for publishing papers, but in the real world we have to look at the constants too. Quicksort has been the algorithm of choice for unstable, in-place, in-memory sorting for so long because we can implement its inner loop very efficiently and it is very cache-friendly. Even if you can implement smoothsort's inner loop as efficiently, or nearly as efficiently, as quicksort's, you will probably find that its cache miss rate makes it slower.

We mitigate quicksort's worst-case performance by spending a little more effort choosing good pivots (to reduce the number of pathological cases) and detecting pathological cases. Look up introsort. Introsort runs quicksort first, but switches to heapsort if it detects excessive recursion (which indicates a pathological case for quicksort).

OTHER TIPS

Better asymptotic doesn't imply better performance (though usually it turns out so). Hidden constant may be several times bigger, causing it to be slower that another algorithm (with same or even worst asymptotic complexity) on arrays of relatively small size (where relatively small array, in fact, may be of arbitrary size, 10100, for example. That's asymptotic analysis). But I don't know anything about smoothsort hidden constants.

For example, there is a O(n) worstcase in time algorithm for finding kth order statistic, but it's so complex that O(n log n) worstcase version outperforms it in most cases.

Also, there is an interesting comparison:

…As you can see, both Timsort and Smoothsort didn’t cut the mustard. Smoothsort is worse than STL sorts in all cases(even with std:bitset replaced with raw bit operations)…

Well first I would say that it is not like Smoothsort is not famous. It depends on the need of a user and also it depends on the user whether to use it or not.

The advantage of smoothsort is that it comes closer to O(n) time if the input is already sorted to some degree, whereas heapsort averages O(n log n) regardless of the initial sorted state.

From the Documentation:-

The smoothsort algorithm needs to be able to hold in memory the sizes of all of the heaps in the string. Since all these values are distinct, this is usually done using a bit vector. Moreover, since there are at most O(log n) numbers in the sequence, these bits can be encoded in O(1) machine words, assuming a transdichotomous machine model.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top