Question

I've read that qsort is just a generic sort, with no promises about implementation. I don't know how libraries vary from platform to platform, but assuming the Mac OS X and Linux implementations are broadly similar, are the qsort implementations recursive and/or do they require a lot of stack?

I have a large array (hundreds of thousands of elements) and I want to sort it without blowing my stack to oblivion. Alternatively, any suggestions for an equivalent for large arrays?


Solution

Here's a version from BSD, copyright Apple, presumably used in OS X at some time or another:

http://www.opensource.apple.com/source/xnu/xnu-1456.1.26/bsd/kern/qsort.c

It is call-recursive, although the upper bound on the depth of recursion is small, as Blindy explains.

Here's a version from glibc, presumably used in Linux systems at some time or another:

http://www.umcs.maine.edu/~chaw/200801/capstone/n/qsort.c

It's not call-recursive. For exactly the same reason that the limit on call-recursion is small, it can use a small fixed amount of stack to manage its loop-recursion.
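The technique can be sketched as follows. This is not glibc's actual code (which handles arbitrary element sizes via a comparator); it is a minimal int-only illustration of the idea: keep an explicit stack of pending subranges, always push the larger partition for later, and loop on the smaller one, so the stack never holds more than about log2(N) entries.

```c
#include <stddef.h>

/* Upper bound on pending ranges: enough for any 64-bit-addressable array,
 * because each pushed range is at most half the size of its parent. */
#define MAX_PENDING 64

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

void quicksort_iter(int *v, size_t n)
{
    struct { size_t lo, hi; } stack[MAX_PENDING];
    int top = 0;
    size_t lo = 0, hi;

    if (n < 2) return;
    hi = n - 1;
    for (;;) {
        while (lo < hi) {
            /* Lomuto partition, last element as pivot (deliberately simple). */
            int pivot = v[hi];
            size_t i = lo;
            for (size_t j = lo; j < hi; j++)
                if (v[j] < pivot) swap(&v[i++], &v[j]);
            swap(&v[i], &v[hi]);

            /* Push the LARGER side for later; keep looping on the smaller. */
            size_t lsize = i - lo;          /* elements left of the pivot  */
            size_t rsize = hi - i;          /* elements right of the pivot */
            if (lsize > rsize) {
                stack[top].lo = lo; stack[top].hi = i - 1; top++;
                lo = i + 1;
            } else {
                stack[top].lo = i + 1; stack[top].hi = hi; top++;
                hi = i ? i - 1 : lo;        /* guard size_t underflow at i == 0 */
            }
        }
        if (top == 0) break;                /* nothing pending: done */
        top--;
        lo = stack[top].lo; hi = stack[top].hi;
    }
}
```

Because each stack entry is pushed from a range at most half the size of the one below it, `MAX_PENDING` of 64 covers any array you could address, which is exactly the property the glibc code relies on.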

Can I be bothered to look up the latest versions? Nope ;-)

For a few hundred thousand array elements, even the call-recursive implementation won't call more than 20 levels deep. In the grand scheme of things that is not deep, except on very limited embedded devices, which wouldn't have enough memory for you to have an array that big to sort in the first place. When N is bounded above, O(log N) obviously is a constant, but more than that it's normally quite a manageable constant. Usually 32 or 64 times "small" is "reasonable".

Other tips

You know, the recursive part is log n deep. In 64 levels of recursion (roughly 64*4 = ~256 bytes of stack in total) you can sort an array of size ~2^64, i.e. an array as large as you can address on a 64-bit CPU, which is 147573952589676412928 bytes for 64-bit integers. You couldn't even hold it in memory!

Worry about stuff that matters imo.

Yes it's recursive. No, it probably will not use large amounts of stack. Why not simply try it? Recursion is not some kind of bogey - it's the solution of choice for very many problems.

A properly implemented qsort does not require more than log2(N) levels of recursion (i.e. depth of stack), where N is the largest array size on the given platform. Note that this limit applies regardless of how good or bad the partitioning happens to be, i.e. it is the worst case depth of recursion. For example, on a 32-bit platform, the depth of recursion will never exceed 32 in the worst possible case, given a sane implementation of qsort.
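The log2(N) bound comes from one simple rule: recurse only into the smaller partition and handle the larger one iteratively. Here is a hedged sketch of that rule (int-only, hypothetical names, not any libc's actual source) with a depth counter added so the bound can be observed directly:

```c
#include <stddef.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* Quicksort that RECURSES only into the smaller partition and loops on the
 * larger one.  The smaller side holds at most half the elements, so the
 * recursion depth can never exceed log2(N), regardless of pivot quality.
 * max_depth records the deepest level actually reached. */
void qsort_depth_bounded(int *v, size_t n, int depth, int *max_depth)
{
    if (depth > *max_depth) *max_depth = depth;
    while (n > 1) {
        /* Lomuto partition, last element as pivot. */
        int pivot = v[n - 1];
        size_t i = 0;
        for (size_t j = 0; j + 1 < n; j++)
            if (v[j] < pivot) swap(&v[i++], &v[j]);
        swap(&v[i], &v[n - 1]);

        size_t left = i, right = n - i - 1;
        if (left < right) {
            qsort_depth_bounded(v, left, depth + 1, max_depth); /* smaller */
            v += i + 1; n = right;      /* loop on the larger side */
        } else {
            qsort_depth_bounded(v + i + 1, right, depth + 1, max_depth);
            n = left;                   /* loop on the larger side */
        }
    }
}
```

Even with a deliberately poor pivot choice, sorting 1000 elements with this scheme never recurses more than about 10 levels deep, which is the point the answer above is making.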

In other words, if you are concerned about the stack usage specifically, you have nothing to worry about, unless you are dealing with some strange low-quality implementation.

I remember reading in the book C Programming: A Modern Approach that the ANSI C specification doesn't define how qsort must be implemented.

And the book noted that qsort could in reality be another kind of sort: merge sort, insertion sort, or why not bubble sort :P

So, the qsort implementation might not be recursive.

With quicksort, the stack will grow logarithmically. You will need a lot of elements to blow up your stack.

I'd guess that most modern implementations of qsort actually use the Introsort algorithm. A reasonably written Quicksort won't blow the stack anyway (it'll sort the smaller partition first, which limits stack depth to logarithmic growth).

Introsort goes a step further though -- to limit the worst case complexity, if it sees that Quicksort isn't working well (too much recursion, which would indicate O(N²) behavior), it'll switch to a Heapsort which guarantees O(N log N) complexity and limits stack usage as well. Therefore, even if the Quicksort it uses is sloppily written, the switch to Heapsort will limit stack usage anyway.
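The introsort idea can be sketched like this. This is not any real library's code; it is an int-only illustration under the stated assumption that the caller passes a depth budget of roughly 2*log2(N). A production version would also recurse into the smaller partition first; here only the budget mechanism is shown.

```c
#include <stddef.h>

static void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* In-place fallback: textbook max-heap sift-down and heapsort. */
static void sift_down(int *v, size_t root, size_t n)
{
    for (;;) {
        size_t child = 2 * root + 1;
        if (child >= n) return;
        if (child + 1 < n && v[child + 1] > v[child]) child++;
        if (v[root] >= v[child]) return;
        swap(&v[root], &v[child]);
        root = child;
    }
}

static void heapsort_ints(int *v, size_t n)
{
    for (size_t i = n / 2; i-- > 0; ) sift_down(v, i, n);
    for (size_t i = n; i-- > 1; ) { swap(&v[0], &v[i]); sift_down(v, 0, i); }
}

/* Quicksort with a depth budget: when the budget runs out (partitioning has
 * gone badly), give up and heapsort the remaining range instead.  This caps
 * the total work at O(N log N) even for adversarial inputs. */
void introsort(int *v, size_t n, int depth_budget)
{
    while (n > 1) {
        if (depth_budget-- == 0) {   /* too many levels: bail out */
            heapsort_ints(v, n);
            return;
        }
        int pivot = v[n - 1];        /* Lomuto partition, last as pivot */
        size_t i = 0;
        for (size_t j = 0; j + 1 < n; j++)
            if (v[j] < pivot) swap(&v[i++], &v[j]);
        swap(&v[i], &v[n - 1]);

        introsort(v, i, depth_budget);  /* left side */
        v += i + 1; n -= i + 1;         /* tail-loop on right side */
    }
}
```

On an already-sorted array (the worst case for this naive pivot), the budget runs out quickly and the heapsort fallback quietly finishes the job; the result is sorted either way.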

A qsort implementation which can fail on large arrays is extremely broken. If you're really worried I'd go RTFS, but I suspect any half-decent implementation will either use an in-place sorting algorithm or use malloc for temporary space and fall back to an in-place algorithm if malloc fails.
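The malloc-and-fall-back pattern that answer describes can be sketched as follows. All names here are hypothetical and it sorts only ints; real implementations work on raw bytes with a comparator. The shape of the idea is: try to allocate scratch space for a merge sort, and if the allocation fails, degrade to an in-place O(N log N) algorithm (heapsort here) instead of crashing.

```c
#include <stdlib.h>
#include <stddef.h>

static void swap_i(int *a, int *b) { int t = *a; *a = *b; *b = t; }

/* In-place fallback: heapsort, O(N log N) time, O(1) extra space. */
static void sift(int *v, size_t root, size_t n)
{
    for (size_t child; (child = 2 * root + 1) < n; root = child) {
        if (child + 1 < n && v[child + 1] > v[child]) child++;
        if (v[root] >= v[child]) break;
        swap_i(&v[root], &v[child]);
    }
}

static void heap_sort(int *v, size_t n)
{
    for (size_t i = n / 2; i-- > 0; ) sift(v, i, n);
    for (size_t i = n; i-- > 1; ) { swap_i(&v[0], &v[i]); sift(v, 0, i); }
}

/* Bottom-up merge sort using the caller-supplied scratch buffer. */
static void merge_sort(int *v, size_t n, int *tmp)
{
    for (size_t width = 1; width < n; width *= 2) {
        for (size_t lo = 0; lo < n; lo += 2 * width) {
            size_t mid = lo + width < n ? lo + width : n;
            size_t hi  = lo + 2 * width < n ? lo + 2 * width : n;
            size_t i = lo, j = mid, k = lo;
            while (i < mid && j < hi)
                tmp[k++] = v[i] <= v[j] ? v[i++] : v[j++];
            while (i < mid) tmp[k++] = v[i++];
            while (j < hi)  tmp[k++] = v[j++];
        }
        for (size_t k = 0; k < n; k++) v[k] = tmp[k]; /* copy back */
    }
}

/* Prefer merge sort when scratch memory is available; otherwise sort
 * in place rather than failing. */
void sort_ints(int *v, size_t n)
{
    int *tmp = malloc(n * sizeof *tmp);
    if (tmp) { merge_sort(v, n, tmp); free(tmp); }
    else      heap_sort(v, n);
}
```

Either path sorts correctly; the only thing the allocation failure costs is merge sort's stability and cache behavior, never correctness.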

The worst-case space complexity of a naive quicksort implementation (which is still a popular option for qsort) is O(N). If the implementation is modified to sort the smaller array first, and tail-recursion optimisation or an explicit stack and iteration is used, then the worst-case space can be brought down to O(log N), as most answers here already note. So you will not blow up your stack if the implementation of quicksort is not broken and the library was not broken by improper compiler flags. But, for example, most compilers that support tail-recursion elimination won't perform this optimization in unoptimized debug builds. A library built with the wrong flags (i.e. with insufficient optimization, for example in the embedded domain where you sometimes build your own debug libc) might then blow the stack.

For most developers, this will never be an issue (they have vendor tested libc's which have O(log N) space complexity), but I'd say it is a good idea to have an eye on potential library issues from time to time.

UPDATE: Here's an example of what I mean: a bug in libc (from 2000) where qsort would start thrashing virtual memory, because the qsort implementation would switch internally to mergesort after wrongly concluding there was enough memory to hold a temporary array.

http://sources.redhat.com/ml/libc-alpha/2000-03/msg00139.html

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow