So, all the improvements we make are only useful when the data meets the worst case (i.e., sorted in the wrong direction)?
It doesn't have to be the worst case, but any sort of preexisting order in the data will do nasty things to the runtime. Preexisting order is very common, and we want a sort that takes advantage of that to run faster, not one that looks at it and barfs.
You've tested your quicksorts on random data. That's pretty much the best-case scenario for a quicksort. What if the data comes from the keys of a dict, and the hash used causes them to come out in mostly-sorted order?
>>> data = dict.fromkeys(random.sample(xrange(10000), 9000)).keys()
>>> timeit.timeit('rand_quicksort(data[:], 0, len(data)-1)', 'from __main__ import rand_quicksort, data', number=1)
0.06688880239187256
>>> timeit.timeit('hoare_quicksort(data[:], 0, len(data)-1)', 'from __main__ import hoare_quicksort, data', number=1)
# about 1000 lines omitted
File "<stdin>", line 9, in hoare_quicksort
File "<stdin>", line 9, in hoare_quicksort
File "<stdin>", line 9, in hoare_quicksort
File "<stdin>", line 9, in hoare_quicksort
File "<stdin>", line 9, in hoare_quicksort
File "<stdin>", line 9, in hoare_quicksort
File "<stdin>", line 9, in hoare_quicksort
File "<stdin>", line 9, in hoare_quicksort
File "<stdin>", line 9, in hoare_quicksort
File "<stdin>", line 4, in hoare_quicksort
RuntimeError: maximum recursion depth exceeded
Well, we get a stack overflow, and that's terrible. Even if we didn't, it'd take freaking forever: on mostly-sorted input, every partition is maximally lopsided, so the recursion depth grows linearly with the input and the runtime goes quadratic.
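To see the blowup in isolation (this is a minimal sketch, not your code): a quicksort that always picks the first element as pivot recurses once per element on already-sorted input, which busts CPython's default recursion limit of 1000 long before the list gets big.

```python
def naive_quicksort(a, lo, hi):
    """Quicksort with a fixed first-element pivot (Lomuto partition).

    On sorted input, nothing is ever smaller than the pivot, so each
    level of recursion only peels off one element -- depth is O(n).
    """
    if lo >= hi:
        return
    pivot = a[lo]
    i = lo + 1
    for j in range(lo + 1, hi + 1):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[lo], a[i - 1] = a[i - 1], a[lo]
    naive_quicksort(a, lo, i - 2)
    naive_quicksort(a, i, hi)

data = list(range(5000))  # already sorted: the pathological case
try:
    naive_quicksort(data, 0, len(data) - 1)
    print("sorted fine")
except RecursionError:  # RuntimeError on Python 2
    print("maximum recursion depth exceeded")
```

The same function handles random data without complaint; it's specifically the preexisting order that kills it.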
(If you want to reproduce this result, note that you have a few bugs in your code: `if p` should be `if p is not None`, and `random.randint(lo, hi + 1)` should be `random.randint(lo, hi)` or `random.randrange(lo, hi + 1)`. I had to fix those to get proper test results.)
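For reference, here's roughly what a corrected randomized-pivot quicksort looks like. This is a sketch with a Lomuto-style partition, not your actual code (your partition scheme may differ); the point is the pivot-selection call, since `random.randint` is inclusive on both ends.

```python
import random

def rand_quicksort(a, lo, hi):
    """Quicksort with a uniformly random pivot.

    randint(lo, hi) is inclusive on both ends, so this covers the
    whole slice; randint(lo, hi + 1) would occasionally index one
    past the end.
    """
    if lo >= hi:
        return
    p = random.randint(lo, hi)        # NOT randint(lo, hi + 1)
    a[lo], a[p] = a[p], a[lo]         # move pivot to the front
    pivot = a[lo]
    i = lo + 1
    for j in range(lo + 1, hi + 1):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[lo], a[i - 1] = a[i - 1], a[lo]
    rand_quicksort(a, lo, i - 2)
    rand_quicksort(a, i, hi)
```

With a random pivot, the expected recursion depth is O(log n) regardless of any preexisting order, so sorted and mostly-sorted inputs stop being special.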