Question

This same question has been asked here so many times by several people. This is a problem which was asked in an entrance exam.

And I am having difficulties in digesting the correct answer of this problem. Why is my approach incorrect?

Here is my approach:
The divide part of the merge sort recursively divides the list of strings into sub-lists with half number of strings each.

Then the merge part takes sorted lists from two of its child node and merge them into its own sorted list. This merging propagates upwards towards the root and the entire list is eventually sorted.

The merging works this way:
The two child-lists together have $(n1+n2) = O(n)$ strings. And for comparing two strings as the part of merging, we need to compare at most $n$ positions (since the strings are of length $n$).

So the merging takes $\theta(n*n) = \theta(n^2)$

Now the recurrence equation looks like: $T(n)= 2T(n/2) + \theta(n^2)$

Using master method $n^{log_22}=n$
And $f(n)=\theta(n^2)=\Omega(n^{log_22})$

Which is case 3 of Master method. so $T(n)=\theta(f(n))=\theta(n^2)$ should be the correct answer.

Was it helpful?

Solution

The mistake and correction

$$T(n)= 2T(n/2) + \theta(n^2)$$

As you intended, $T(n)$ is the worst time of mergesorting $n$ strings of length $n$. Then, $T(n/2)$ on the RHS means the worst time of mergesorting $n/2$ strings of length $n/2$. So during the mergesort, the length of strings shrinks from $n$ to $n/2$!

What should have been done is using a different variable to express the number of strings at the start of each mergesort. Let us define $T(m, n)$ to be the cost of mergesorting $m$ strings of length $n$. Now, we will have

$$T(m, n)= 2T(m/2, n) + \theta(mn),$$

from which we can deduce $$T(m,n) = \theta(nm\log m).$$


Another correct approach

Mergesort of $n$ items takes $\theta(n\log n)$ time, assuming a unit cost of comparison. In fact, mergesort uses $\theta(n\log n)$ comparisons. Now that it costs $\theta(n)$ to compare two strings of length $n$ at worst time, the worst time of mergesort of $n$ strings of length $n$ should be $\theta(n\cdot n\log n)$.

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange
scroll top