Question

I was going through the traditional quicksort algorithm. I looked at the partition algorithm in a couple of places, and the implementation differences were quite subtle. Here are the two approaches:

Approach 1: Pivot is the last element

partition (arr[], low, high)
{
    // pivot (Element to be placed at right position)
    pivot = arr[high];  

    i = low - 1;  // index of the last element known to be <= pivot

    for (j = low; j <= high - 1; j++)
    {
        // If current element is smaller than or
        // equal to pivot
        if (arr[j] <= pivot)
        {
            i++;    // increment index of smaller element
            swap arr[i] and arr[j]
        }
    }
    swap arr[i + 1] and arr[high]
    return (i + 1)
}

Approach 2: Pivot is the first element

Partition(a[], l, h)
{
    pivot = a[l];
    i = l; j = h;
    while(i < j)
    {
        while(a[i] <= pivot)
            i++;
        while(a[j] > pivot)
            j--;
        if(i < j)
            swap(a[i], a[j]);
    }
    swap(a[l], a[j]);
    return j;
}

In Approach 1, a single for loop is used, hence it is called linear time; but in Approach 2 there are nested loops. My suspicion, however, is that the task performed by those nested loops does not incur more than linear time.

Please shed some light on this. Thanks in advance.

Solution

Can nested loops have linear time complexity?

Yes. Consider this function:

from math import isqrt  # integer square root, so range() gets an int

def linear_time(n):
    for i in range(isqrt(n)):
        for j in range(isqrt(n)):
            something_that_takes_constant_time()

Its time complexity is O(sqrt(n) * sqrt(n)) = O(n).

Remember:

def foo(x):
    part_a(x)
    part_b(x)
    # if part_a(x) has a time complexity of O(f(x))
    # and if part_b(x) has a time complexity of O(g(x))
    # then foo(x) has a time complexity of O(f(x) + g(x))

def foo(x):
    loop f(x) times:
        part_of_loop(x)
    # if part_of_loop(x) has a time complexity of O(g(x))
    # then foo(x) has a time complexity of O(f(x) * g(x))

So let's examine this algorithm:

Partition(a[], l, h)
{
    pivot = a[l]; # O(1)
    i = l; j = h; # O(1)
    while(i < j) # runs ??? times
    {
        while(a[i] <= pivot) # runs O(h - l) times
            i++; # O(1)
        while(a[j] > pivot) # runs O(h - l) times
            j--; # O(1)
        if(i < j) # O(1)
            swap(a[i], a[j]); # O(1)
    }
    swap(a[l], a[j]); # O(1)
    return j; # O(1)
}

So the inner part of the loop has a time complexity of O((h - l) * 1 + (h - l) * 1 + 1 * 1) = O(h - l). But what is the time complexity of the whole loop? It runs as often as (i < j) is true.

The postcondition after while(a[i] <= pivot) is a[i] > pivot, and after while(a[j] > pivot) we have a[j] <= pivot. If i < j (the only case we care about; otherwise the loop ends anyway!), the values at the two indices are swapped, so a[i] <= pivot and a[j] > pivot hold once more. In other words, the loop finds the leftmost and the rightmost index that are on the wrong side of the pivot and swaps them, until everything is partitioned correctly. So how often does that loop run? We can't say exactly; it depends on how many elements are on the "wrong" side of the pivot. We do know that on every iteration after the first, i and j move at least one step closer to each other, so an upper bound is (h - l) / 2 + 1, which is of complexity O(h - l).

Now, I skipped over something: the number of times the outer loop runs depends on the number of times the inner loops run. As the number of iterations of the inner loops tends towards O(h - l), the number of iterations of the outer loop tends towards 1, and vice versa; they are, in a sense, eating up each other's runs. The upper bound mentioned in the previous paragraph is what we get when the inner loops run only once per iteration, making their cost O(1). In general, if the inner loops run a and b times respectively, the outer loop runs O((h - l) / (a + b)) times. In other words, the body of the outer loop costs O(a + b) and the outer loop runs O((h - l) / (a + b)) times, making the whole loop have a time complexity of O((h - l) / (a + b) * (a + b)) = O(h - l). Thus this partition works in linear time.
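If you want to see this bound empirically, here is a small instrumented sketch in C; the test array and the step_count counter are my additions, purely for illustration. Every time i or j moves, the counter goes up, and the total stays around h - l no matter how the work is split between the outer and inner loops.

#include <stdio.h>

/* The same partition as above, instrumented so that step_count goes
 * up every time i moves right or j moves left. The counter and the
 * test array are additions for illustration only. */
static int step_count = 0;

int partition(int a[], int l, int h)
{
    int pivot = a[l];
    int i = l, j = h;
    while (i < j)
    {
        while (a[i] <= pivot) { i++; step_count++; }
        while (a[j] > pivot)  { j--; step_count++; }
        if (i < j)
        {
            int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
        }
    }
    int tmp = a[l]; a[l] = a[j]; a[j] = tmp;
    return j;
}

int main(void)
{
    /* 9 elements, so h - l = 8; some element is > pivot, so the
     * unguarded scans stay in range here (see the bug note below). */
    int a[] = { 5, 8, 1, 9, 3, 7, 2, 6, 4 };
    int p = partition(a, 0, 8);
    printf("pivot lands at index %d after %d pointer steps\n",
           p, step_count);
    return 0;
}

On this input the pivot 5 lands at index 4 after 9 pointer steps, i.e. (h - l) + 1, consistent with the linear bound above.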

(I do think there is a bug here: what happens if a[i] <= a[l] for all l <= i <= h, i.e. if the pivot is the largest element in the range? The scan while(a[i] <= pivot) never stops and runs past the end of the range.)
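For what it's worth, here is one possible guard, sketched in C; the extra i <= h and j > l bounds checks are my assumed fix, not something from the original pseudocode:

/* Guarded variant: stops the scans at the ends of the range so the
 * indices cannot run past it when every element is <= the pivot
 * (or every element is > it). The bounds checks are an assumed
 * fix, not part of the original pseudocode. */
int partition_guarded(int a[], int l, int h)
{
    int pivot = a[l];
    int i = l, j = h;
    while (i < j)
    {
        while (i <= h && a[i] <= pivot)
            i++;
        while (j > l && a[j] > pivot)
            j--;
        if (i < j)
        {
            int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
        }
    }
    int tmp = a[l]; a[l] = a[j]; a[j] = tmp;
    return j;
}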

OTHER TIPS

Nested loops are actually irrelevant to big-O. What counts is how many times you deal with each item. In general your loops define your nesting, but this is a special case where the subranges rapidly get smaller as you dig down. (Note that if you somehow select a very bad pivot at each step, the subranges don't shrink quickly and you end up with an O(n^2) runtime. Choosing the first item of the list each time and trying to sort an already-sorted list will cause exactly this.)
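One common mitigation is to randomize the pivot choice. A minimal sketch in C, assuming the guarded partition from the answer above is available; with a random pivot, quicksort's expected running time is O(n log n) even on already-sorted input:

#include <stdlib.h>

/* Assumes the partition_guarded sketched in the answer above. */
int partition_guarded(int a[], int l, int h);

/* Swap a randomly chosen element into position l before
 * partitioning, so that a pre-sorted input no longer forces the
 * O(n^2) worst case. Call srand() once at program start if you
 * want different runs to pick different pivots. */
int partition_random(int a[], int l, int h)
{
    int r = l + rand() % (h - l + 1);  /* random index in [l, h] */
    int tmp = a[l]; a[l] = a[r]; a[r] = tmp;
    return partition_guarded(a, l, h);
}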

Licensed under: CC-BY-SA with attribution