Question

I am having trouble finding good resources that give a worst case $O(n \ln n)$ in place stable sorting algorithm. Does anyone know of any good resources?

Just a reminder, in place means it uses the array passed in and the sorting algorithm is only allowed to use constant extra space. Stable means that elements with the same key appear in the same order in the sorted array as they did in the original.
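To illustrate the stability requirement concretely, here is a small Python example using the built-in `sorted`, which is documented to be stable:

```python
# Records share sort keys; a stable sort must keep equal-key records
# in their original relative order.
records = [("b", 1), ("a", 1), ("b", 2), ("a", 2)]

# Sort by the letter only, ignoring the second field.
stable = sorted(records, key=lambda r: r[0])

# Equal-key elements keep their original order:
# ("a", 1) stays before ("a", 2), and ("b", 1) before ("b", 2).
```

An unstable sort would be free to emit `("a", 2)` before `("a", 1)`, since both compare equal under the key.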

For example, naive merge sort is worst case $O(n \ln n)$ and stable but uses $O(n)$ extra space. Standard quicksort can be made stable, is in place but is worst case $O(n^2)$. Heapsort is in place, worst case $O(n \ln n)$ but isn't stable. Wikipedia has a nice chart of which sorting algorithms have which drawbacks. Notice that there is no sorting algorithm that they list that has all three conditions of stability, worst case $O(n \ln n)$ and being in place.

I have found a paper called "Practical in-place mergesort" by Katajainen, Pasanen and Teuhola, which claims to have a worst case $O(n \ln n)$ in place stable mergesort variant. If I understand their results correctly, they use (bottom-up?) mergesort recursively on the first $\frac{1}{4}$ of the array and the latter $\frac{1}{2}$ of the array and use the second $\frac{1}{4}$ as scratch space to do the merge. I'm still reading through this so any more information on whether I'm interpreting their results correctly is appreciated.
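If I'm reading the technique right, the key primitive is a merge that uses a disjoint region of the *same* array as swap space, so no extra memory is allocated. A minimal Python sketch of that primitive (the function name and layout are mine, not the paper's):

```python
def merge_with_buffer(a, lo, mid, hi, buf):
    """Merge the sorted runs a[lo:mid] and a[mid:hi], using the region
    a[buf:buf + (mid - lo)] of the same array as swap space.
    Only element swaps are used, so no extra memory is allocated."""
    n = mid - lo
    # Swap the left run into the buffer region (buffer junk moves to a[lo:mid]).
    for k in range(n):
        a[buf + k], a[lo + k] = a[lo + k], a[buf + k]
    i, j, out = buf, mid, lo
    while i < buf + n and j < hi:
        if a[i] <= a[j]:              # <= keeps the merge stable
            a[out], a[i] = a[i], a[out]
            i += 1
        else:
            a[out], a[j] = a[j], a[out]
            j += 1
        out += 1
    while i < buf + n:                # flush the rest of the left run
        a[out], a[i] = a[i], a[out]
        i += 1
        out += 1
```

After the merge, the buffer region holds the displaced, unsorted elements; in the full algorithm that region is sorted in a later phase, which is where the recursive structure you describe comes in.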

I would also be very interested in a worst case $O(n \ln n)$ in place stable quicksort. From what I understand, modifying quicksort to be worst case $O(n \ln n)$ requires selecting a proper pivot which would destroy the stability that it would otherwise normally enjoy.

This is purely of theoretical interest and I have no practical application. I would just like to know the algorithm that has all three of these features.


Solution

There are several algorithms that are all of the above, and pretty much all of them have been invented in the last 30 years.

Probably the nicest is the class of algorithms called Block sort, including WikiSort, which is based on the 2008 stable in-place merging algorithm of Kim and Kutzner. It is not only stable and completely in-place (O(1) memory overhead in the transdichotomous model), it is also adaptive, and thus takes fewer comparisons on nearly sorted lists, converging to O(n) comparisons for an already sorted list. You can find implementations in C, C++, and Java here: https://github.com/BonzaiThePenguin/WikiSort

Also of interest is the GrailSort algorithm (also a Block sort), based on work by Huang and Langston (1989–1992), which actually outperforms WikiSort on several types of test cases. An implementation is available here: https://github.com/Mrrl/GrailSort

OTHER TIPS

You can write an in-place, stable mergesort. See this for details. In the author's own words:

A beautiful in-place merge algorithm. Test it on inverted arrays to understand how rotations work. Fastest known in-place stable sort. No risk of exploding a stack. Cost: a relatively high number of moves. The stack can still be expensive too. This is a merge sort with a smart in-place merge that 'rotates' the sub-arrays. This code is literally copied from the C++ STL library and translated into Java.

I won't copy the code here, but you can find it at the link or by checking the C++ STL. Please let me know if you would like me to try to provide a more detailed description of what's going on here.

Please take this as a long comment on some practical thoughts. Although this is not an answer to your question, I think you might be interested in this Python discussion:

This describes an adaptive, stable, natural mergesort, modestly called timsort (hey, I earned it). It has supernatural performance on many kinds of partially ordered arrays (less than $\lg(N!)$ comparisons needed, and as few as $N-1$), yet as fast as Python's previous highly tuned samplesort hybrid on random arrays.

[...]

Merging adjacent runs of lengths A and B in-place is very difficult. Theoretical constructions are known that can do it, but they're too difficult and slow for practical use. But if we have temp memory equal to min(A, B), it's easy.

Source: bugs.python.org, author: Tim Peters
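The "easy with temp memory equal to min(A, B)" part can be sketched as follows: copy the smaller run aside, then merge forward or backward into the gap it leaves. This is a simplified illustration (real Timsort adds a "galloping" mode on top):

```python
def merge_runs(a, lo, mid, hi):
    """Stable merge of sorted runs a[lo:mid] and a[mid:hi], using a temp
    buffer of size min(A, B). Simplified Timsort-style merge, no galloping."""
    if mid - lo <= hi - mid:
        left = a[lo:mid]              # copy the smaller (left) run aside
        i, j, out = 0, mid, lo        # merge forward into the gap
        while i < len(left) and j < hi:
            if left[i] <= a[j]:       # take left on ties: keeps stability
                a[out] = left[i]; i += 1
            else:
                a[out] = a[j]; j += 1
            out += 1
        a[out:out + len(left) - i] = left[i:]
    else:
        right = a[mid:hi]             # copy the smaller (right) run aside
        i, j, out = mid - 1, hi - mid - 1, hi - 1   # merge backward
        while i >= lo and j >= 0:
            if a[i] <= right[j]:      # take right on ties: keeps stability
                a[out] = right[j]; j -= 1
            else:
                a[out] = a[i]; i -= 1
            out -= 1
        a[lo:lo + j + 1] = right[:j + 1]
```

Copying only the smaller run is what bounds the extra memory at min(A, B); merging backward when the right run is smaller is what makes the gap land where the copied elements came from.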

So it seems as if Timsort still has practical advantages over a stable, in-place, $\mathcal{O}(n \log n)$ worst-case merge sort.

Also note that Timsort performs well on already-sorted arrays.

So Python uses Timsort (which is Mergesort with some tweaks), and when I looked up the Java implementation some years ago, it was also a Mergesort variant (I believe Java now uses Timsort as well, for sorting object arrays).

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange