Question

I recently learnt how much work has gone into making quicksort quicker: choosing the pivot randomly, switching to insertion sort for small subarrays, even handling equal keys with 3-way partitioning. I was curious how these variants behave on randomly generated data, so I profiled some Python code; the scripts are attached below. The problem is that the scripts all take about the same amount of time! And when I use %prun, the number of times quicksort gets called is also quite similar. So are all these improvements only useful when the data hits the worst case (i.e. is largely sorted in the wrong direction)?

def hoare_partition(a, lo, hi):
    # Partition a[lo..hi] around the first element (Hoare-style scan);
    # returns the pivot's final index, or None for degenerate ranges.
    if lo >= hi or (lo + 1) == len(a) - 1:
        return None
    pivot = a[lo]
    left = lo + 1
    right = hi

    while left <= right and right < len(a):
        while left < len(a) and a[left] < pivot:
            left += 1
        while a[right] > pivot:
            right -= 1
        if left <= right and right < len(a):
            a[left], a[right] = a[right], a[left]
            left += 1
            right -= 1
    a[lo], a[right] = a[right], a[lo]
    return right

def hoare_quicksort(a, lo, hi):
    '''Vanilla quicksort: calls the partition above, which uses the first element as the pivot.'''

    if lo < hi:
        p = hoare_partition(a, lo, hi)
        if p:
            #print 'calling for ', lo, p - 1
            hoare_quicksort(a, lo, p - 1)

            #print 'calling for ', p + 1, hi
            hoare_quicksort(a, p + 1, hi)

This was the vanilla implementation, where the first element itself is selected as the pivot. Then I changed it to select the midpoint.

So only the pivot selection changes:

mid = lo + (hi - lo)//2
a[lo], a[mid] = a[mid], a[lo]
pivot = a[lo]

And then I also do random pivot selection, like this:

pos = random.randint(lo, hi + 1)
a[lo], a[pos] = a[pos], a[lo]
pivot = a[lo]

Now, I call them using

%prun hoare_quicksort([random.randint(0, 10000) for i in xrange(1000)], 0, 999)
%prun mid_quicksort([random.randint(0, 10000) for i in xrange(1000)], 0, 999)
%prun random_quicksort([random.randint(0, 10000) for i in xrange(1000)], 0, 999)

All of them take almost the same amount of time (5.22, 5.27, 5.61 ms). And when I look at the %prun output, quicksort gets called a very similar number of times in each case. So, well, what's wrong?


Solution 2

So are all these improvements only useful when the data hits the worst case (i.e. is largely sorted in the wrong direction)?

It doesn't have to be the worst case, but any sort of preexisting order in the data will do nasty things to the runtime. Preexisting order is very common, and we want a sort that takes advantage of that to run faster, not one that looks at it and barfs.

You've tested your quicksorts on random data. That's pretty much the best-case scenario for a quicksort. What if the data comes from the keys of a dict, and the hash used causes them to come out in mostly-sorted order?

>>> data = dict.fromkeys(random.sample(xrange(10000), 9000)).keys()
>>> timeit.timeit('rand_quicksort(data[:], 0, len(data)-1)', 'from __main__ import rand_quicksort, data', number=1)
0.06688880239187256
>>> timeit.timeit('hoare_quicksort(data[:], 0, len(data)-1)', 'from __main__ import hoare_quicksort, data', number=1)
  # about 1000 lines omitted
  File "<stdin>", line 9, in hoare_quicksort
  File "<stdin>", line 9, in hoare_quicksort
  File "<stdin>", line 9, in hoare_quicksort
  File "<stdin>", line 9, in hoare_quicksort
  File "<stdin>", line 9, in hoare_quicksort
  File "<stdin>", line 9, in hoare_quicksort
  File "<stdin>", line 9, in hoare_quicksort
  File "<stdin>", line 9, in hoare_quicksort
  File "<stdin>", line 9, in hoare_quicksort
  File "<stdin>", line 4, in hoare_quicksort
RuntimeError: maximum recursion depth exceeded

Well, we get a stack overflow, and that's terrible. Even if we didn't, it'd take freaking forever.
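(As an aside, the overflow itself is avoidable even without changing the pivot rule. A minimal sketch, not from the original post: recurse only into the smaller partition and loop on the larger one, which caps the stack depth at O(log n). The quadratic running time on sorted input remains, of course; this only keeps the stack from blowing up.)

def quicksort_bounded(a, lo, hi):
    # Sketch: assumes a partition function like hoare_partition above
    # that returns the pivot's final index (or None).
    while lo < hi:
        p = hoare_partition(a, lo, hi)
        if p is None:
            return
        if p - lo < hi - p:
            quicksort_bounded(a, lo, p - 1)  # smaller half: recurse
            lo = p + 1                       # larger half: keep looping
        else:
            quicksort_bounded(a, p + 1, hi)
            hi = p - 1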

(If you want to reproduce this result, note that you have a few bugs in your code: if p should be if p is not None, and random.randint(lo, hi + 1) should be random.randint(lo, hi) or random.randrange(lo, hi + 1). I had to fix those to get proper test results.)
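For reference, the fixed lines look like this; everything else in the question's code stays the same:

# p can legitimately be 0, and `if p:` treats 0 as false, so the
# leftmost partition would silently never get sorted.
p = hoare_partition(a, lo, hi)
if p is not None:
    hoare_quicksort(a, lo, p - 1)
    hoare_quicksort(a, p + 1, hi)

# random.randint is inclusive on both ends, so randint(lo, hi + 1) can
# pick an index one past the subarray; randrange excludes its upper bound.
pos = random.randrange(lo, hi + 1)
a[lo], a[pos] = a[pos], a[lo]
pivot = a[lo]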

OTHER TIPS

Your benchmark is broken.

  1. You're timing the list comprehension too: the 1000 calls to random.randint run inside the profiled statement, so data generation is mixed in with the sort.
  2. You're running each sort only once, so one-off noise such as OS thread and process scheduling dominates the measurement.

Try precreating the source array and running each sort thousands, even millions, of times.
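A minimal sketch of that approach, assuming Python 3 and the three function names from the question: build the data once, copy it inside each run so every sort sees the same unsorted input, and let timeit sum over many repetitions.

import random
import timeit

data = [random.randint(0, 10000) for _ in range(1000)]  # built once, outside the timer

for sort in (hoare_quicksort, mid_quicksort, random_quicksort):
    # data[:] hands each run a fresh copy, so no run ever sees
    # already-sorted input; the copy cost is the same for all three.
    t = timeit.timeit(lambda: sort(data[:], 0, len(data) - 1), number=1000)
    print(sort.__name__, t)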

Randomized pivot selection does not make quicksort faster by itself; its point is to avoid the worst case. Say we sort an already-sorted array and always pick the rightmost element of each subarray as the pivot. That element is the maximum of the subarray, so quicksort splits it in the most unbalanced way possible: one empty part and one containing everything else. Randomizing the pivot prevents this. Once the worst case is avoided, each recursion level produces partitions of roughly constant balance, so the recursion tree has depth O(log n) and the whole sort runs in O(n log n).
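To make that concrete, here is a small self-contained experiment of my own (it uses a simple Lomuto-style partition rather than the question's Hoare one, purely to keep the counting code short). It measures the recursion depth each pivot rule produces on already-sorted input: a fixed pivot degenerates to linear depth, a random pivot stays logarithmic.

import random
import sys

def depth(a, lo, hi, pick_pivot, d=0):
    # Returns the deepest recursion level reached while sorting a[lo..hi].
    if lo >= hi:
        return d
    pos = pick_pivot(lo, hi)
    a[lo], a[pos] = a[pos], a[lo]
    pivot, store = a[lo], lo
    for i in range(lo + 1, hi + 1):   # Lomuto partition around a[lo]
        if a[i] < pivot:
            store += 1
            a[store], a[i] = a[i], a[store]
    a[lo], a[store] = a[store], a[lo]
    return max(depth(a, lo, store - 1, pick_pivot, d + 1),
               depth(a, store + 1, hi, pick_pivot, d + 1))

sys.setrecursionlimit(10000)
n = 2000
print(depth(list(range(n)), 0, n - 1, lambda lo, hi: lo))                      # ~n levels
print(depth(list(range(n)), 0, n - 1, lambda lo, hi: random.randint(lo, hi)))  # a few dozen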

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow