How to use Disjoint Sets in Connected Component labeling?

https://stackoverflow.com/questions/8877438

16-04-2021
|

Question

I am having some hard time using Disjoint Sets in Connected Component Labeling. I have looked on many examples and also on this question where Bo Tian provided a very good implementation of Disjoint Sets as C++ linked list. I have already implemented Connected Component labeling ( labels are simple integers ) in my program but I have a really hard time resolving equivalences amongst labels with disjoint sets.

Can anyone help me on that - maybe using the Bo Tian's implementation ? I think that will also help others when they come to this point.

EDIT

My algorithm goes through the image and when it finds two labels two connected pixels with different labels it has to make a note in the 'equivalence registry' (which would be the Disjoint set forest). After looping the whole image I have to resolve the equivalences by (going second pass over the image) looking at the registry and then marking these pixels' which have equivalent labels to the minimum out of the set.

Solution

Check this tutorial on DJS. Only modification is that during union you have to connect bigger to lesser, so root is always mimimum of the set.

OTHER TIPS

The disjoint-set forest supplies two operations:

Union, which takes two objects and links them, and
Find, which takes two objects and returns the ID of the group they're in.

Disjoint-set forests are used primarily to maintain a partition of a group of objects into a family of different clusters, each of which is disjoint from the other (that is, each object is in at most one group). The disjoint-set forest then allows you to efficiently tell what objects are in each group, or (in roughly O(n) time) to determine the cluster IDs for each object.

To use a disjoint set forest, you would begin by putting each object into its own cluster. From that point forward, every time that you wanted to mark that two different objects are in the same cluster, you would use the union operation to link them together. At the very end, you would call find on each point to determine which cluster it belongs to, and from there could read off what group everything belongs to.

Hope this helps!

You are right, labeling the connected sets is only half the work done. Finding disjoint sets using equivalences is also equally hard part. I faced the exact scenario.

One way of finding disjoint sets (after labeling) is by using Union-Find algorithm. Check out the following article. You will understand from scratch how labeling and finding disjoint sets are implemented. Illustrations are also given with sample input and output matrices.

http://www.codeding.com/articles/connected-sets-labeling

There is a blog about connected component labeling with disjoint set:

http://www.keithlantz.net/2011/04/c-implementation-of-the-connected-component-labeling-method-using-the-disjoint-set-data-structure/

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow