Best-case performance of disjoint set forests, and proving lower bounds of algorithms

https://stackoverflow.com/questions/5346467

27-10-2019
|

Question

There is a question on an assignment that was due today which solutions have been released for, and I don't understand the correct answer. The question deals with best-case performance of disjoint sets in the form of disjoint set forests that utilize the weighed union algorithm to improve performance (the smaller of the trees has its root connected as a child to the root of the larger of the two trees) but without using the path compression algorithm.

The question is whether the best case performance of doing (n-1) Union operations on n singleton nodes and m>=n Find operations in any order is Omega(m*logn) which the solution confirms is correct like this:

There is a sequence S of n-1 Unions followed by m >= n Finds that takes Omega(m log n) time. The sequence S starts with a sequence n-1 Unions that builds a tree with depth Omega(log n). Then it has m>=n Finds, each one for the deepest leaf of that tree, so each one takes (log n) time.

My question is, why does that prove the lower bound is Omega(m*logn) is correct? Isn't that just an isolated example of when the bound would be Omega(m*logn) that doesn't prove it for all inputs? I am certain one needs to only show one counter-example when disproving a claim but needs to prove a predicate for all possible inputs in order to prove its correctness.

In my answer, I pointed out the fact that you could have a case when you start off by joining two singleton nodes together. You then join in another singleton to that 2-node tree with 3 nodes sharing the same parent, then another etc., until you join together all the n nodes. You then have a tree where n-1 nodes all point up to the same parent, which is essentially the result you obtain if you use path compression. Then every FIND is executed in O(1) time. Thus, a sequence of (n-1) Unions and m>=n Finds ends up being Omega(n-1+m) = Omega(n+m) = Omega(m).

Doesn't this imply that the Omega(m*logn) bound is not tight and the claim is, therefore, incorrect? I'm starting to wonder if I don't fully understand Big-O/Omega/Theta :/

EDIT : fixed up the question to be a little clearer

EDIT2: Here is the original question the way it was presented and the solution (it took me a little while to realize that Gambarino and the other guy are completely made up; hardcore Italian prof)

Solution

Seems like I indeed misunderstood the concept of Big-Omega. For some strange reason, I presumed Big-Omega to be equivalent to "what's the input into the function that results in the best possible performance". In reality, most likely unsurprisingly to the reader but a revelation to me, Big-Omega simply describes the lower bound of a function. That's it. Therefore, a worst case input will have a lower and upper bounds (big-O and omega), and so will the best possible input. In case of big-omega here, all we had to do was come up with a scenario where we pick the 'best' input given the limitations of the worst case, i.e. that there is some input of size n that will take the algorithm at least m*logn steps. If such input exists, then the lower bound is tight.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow