Question

I'm stuck on the following problem from Skiena (The Algorithm Design Manual, p. 106):

Problem: Give an efficient algorithm to determine whether two sets (of size $m$ and $n$, respectively) are disjoint. Analyze the worst-case complexity in terms of $m$ and $n$, considering the case where $m$ is substantially smaller than $n$.

Solution: First sort the big set. The big set can be sorted in $O(n\log n)$ time. We can now do a binary search with each of the $m$ elements of the second (smaller) set, looking to see if it exists in the big set. The total time will be $O((n + m)\log n)$.

My question: Why is the running time $O((n + m)\log n)$? I would have to perform $m$ binary searches in total (one binary search for every element of the smaller set). Since one binary search has a running time of $O(\log n)$, I would have to perform $m \cdot \log n$ operations in the worst case. How, if at all, does this translate into $O((n + m)\log n)$?


Solution

Your estimate of $m \cdot \log n$ only counts the binary searches; it leaves out the cost of sorting the big set in the first place. Sorting the big set takes $O(n\log n)$ time. You then perform $m$ binary searches, each taking $O(\log n)$, for a total of $O(m\log n)$ time spent on binary search. The total running time of the algorithm is thus $$ O(n\log n + m\log n) = O((n + m)\log n) = O(n\log n), $$ assuming $m \leq n$.
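For concreteness, here is a minimal Python sketch of this approach; the function name `disjoint` and the choice of `bisect_left` are mine, not from the source:

```python
from bisect import bisect_left

def disjoint(small, big):
    """Return True if the two collections share no element.

    Illustrative sketch: `small` has m elements, `big` has n elements.
    """
    big_sorted = sorted(big)              # sort the big set: O(n log n)
    for x in small:                       # m iterations
        i = bisect_left(big_sorted, x)    # binary search: O(log n) each
        if i < len(big_sorted) and big_sorted[i] == x:
            return False                  # common element found
    return True                           # no element of small appears in big
```

For example, `disjoint([2, 7], [1, 3, 5, 9, 11])` returns `True`, while `disjoint([2, 9], [1, 3, 5, 9, 11])` returns `False`.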

Note that it is even better to sort the smaller list, since then the running time improves to $O(n\log m)$.
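With the sketch above, this variant amounts to swapping the arguments, i.e. calling `disjoint(big, small)`: sorting the smaller list costs $O(m\log m)$ and the $n$ binary searches cost $O(n\log m)$, for a total of $O((n + m)\log m) = O(n\log m)$ when $m \leq n$.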

Licensed under: CC-BY-SA with attribution