Question

My program listens for incoming data, and roughly 5 data points arrive every second. All of the data is stored in a data structure. Once the structure holds 360 000 items, I need to find the 25th, 50th, and 75th percentiles of the stored data.

Which of the following would be more efficient? Or, if you know a better method, please help me out.

Should I use an order statistic tree? Insertion and deletion are O(log n).

Or should I wait until all 360 000 data points have been collected, then sort them and read the 25th, 50th, and 75th percentiles from the sorted list?
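For reference, a minimal sketch of the sort-and-index option (Python, with random numbers standing in for the real stream):

```python
import random

# Stand-in for the 360 000 collected samples.
data = [random.random() for _ in range(360_000)]

data.sort()  # one O(n log n) sort after collection

n = len(data)
p25 = data[n // 4 - 1]       # 90 000th smallest (1-based)
p50 = data[n // 2 - 1]       # 180 000th smallest
p75 = data[3 * n // 4 - 1]   # 270 000th smallest
print(p25, p50, p75)
```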

Solution

You could use a selection algorithm (quickselect) to find the different percentiles.

In your problem you know you need to find the elements at positions 90k, 180k, and 270k of the sorted order. Once all 360k elements have been collected, pick a random pivot element and partition the elements into three sublists: those smaller than, equal to, and greater than the pivot. The sizes of those sublists tell you which position the pivot landed on, so you can then recurse into either the smaller or the bigger sublist, depending on which percentile you are looking for.
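A minimal sketch of that idea in Python (the function name and the driver at the bottom are illustrative, not a fixed API):

```python
import random

def quickselect(items, k):
    """Return the k-th smallest element of items (k is 1-based)."""
    # Partition around a random pivot, then recurse only into the
    # sublist that still contains position k.
    pivot = random.choice(items)
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    bigger = [x for x in items if x > pivot]

    if k <= len(smaller):
        return quickselect(smaller, k)
    if k <= len(smaller) + len(equal):
        return pivot  # position k falls on the pivot itself
    return quickselect(bigger, k - len(smaller) - len(equal))

# The three percentiles for n = 360 000 (random data as a stand-in):
data = [random.random() for _ in range(360_000)]
p25 = quickselect(data, 90_000)
p50 = quickselect(data, 180_000)
p75 = quickselect(data, 270_000)
```

Each call does expected O(n) work; you could also reuse the partitions from one call to narrow the search for the others, but three independent calls are already linear in expectation.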

In the best case this runs in O(n): a single partition pass could land the pivot exactly on the percentile you want, but that is very unlikely. In the worst case you could always pick the smallest element as the pivot, requiring O(n) passes and O(n^2) total work, but that is very unlikely too. Luckily, the expected running time turns out to be T(n) <= 8n, which is linear.

As a tip, you could track the min and max while the data streams in, and then use (max + min) / 2 as the first pivot instead of a random element. Of course, this assumes the values are roughly symmetrically distributed (e.g., close to normal) and not heavily skewed.
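A small sketch of that tip, in the same Python setting as above (StreamStats and its wiring are illustrative names, not a real API):

```python
class StreamStats:
    """Collect samples and track the running min/max as they arrive."""

    def __init__(self):
        self.data = []
        self.lo = float("inf")
        self.hi = float("-inf")

    def add(self, x):
        self.data.append(x)
        self.lo = min(self.lo, x)
        self.hi = max(self.hi, x)

stream = StreamStats()
for x in (0.3, 0.7, 0.1):  # stand-in for the ~5 samples/second stream
    stream.add(x)

# Seed the first partition with the midpoint instead of a random pivot;
# only a good guess when the data are not heavily skewed.
first_pivot = (stream.lo + stream.hi) / 2
```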

If you need more details on the algorithm, have a look here: http://cseweb.ucsd.edu/~dasgupta/103/4a.pdf
