Question

Here is my Objective-C implementation of a disjoint set. - Positive number point to parent. - Negative number indicate root & children count. (So they each start disjointed at -1.) - The index acts as the data I am grouping. Seems to work ok... just had a couple questions.

  1. find: How can I compress the path? Because I am not doing it recursively, do I have to store the path and loop it again to set after find root?

  2. join: I am basing join on children count instead of depth!? I guess that is not right. Do I need to do something special during join if depths equal?

Thanks.

DisjointSet.h

@interface DisjointSet : NSObject
{
    NSMutableArray *_array;
}

- (id)initWithSize:(NSInteger)size;
- (NSInteger)find:(NSInteger)item;
- (void)join:(NSInteger)root1 root2:(NSInteger)root2;

@end

DisjointSet.m

#import "DisjointSet.h"

@implementation DisjointSet

- (id)initWithSize:(NSInteger)size
{
    self = [super init];
    if (self)
    {
        _array = [NSMutableArray arrayWithCapacity:size];
        for (NSInteger i = 0; i < size; i++)
        {
            [_array addObject:[NSNumber numberWithInteger:-1]];
        }
    }
    return self;
}

- (NSInteger)find:(NSInteger)item
{
    while ([[_array objectAtIndex:item] integerValue] >= 0)
    {
        item = [[_array objectAtIndex:item] integerValue];
    }
    return item;
}

- (void)join:(NSInteger)root1 root2:(NSInteger)root2
{
    if (root1 == root2) return;

    NSInteger data1 = [[_array objectAtIndex:root1] integerValue];
    NSInteger data2 = [[_array objectAtIndex:root2] integerValue];
    if (data2 < data1)
    {
        [_array setObject:[NSNumber numberWithInteger:data2 + data1] atIndexedSubscript:root2];
        [_array setObject:[NSNumber numberWithInteger:root2] atIndexedSubscript:root1];
    }
    else
    {
        [_array setObject:[NSNumber numberWithInteger:data1 + data2] atIndexedSubscript:root1];
        [_array setObject:[NSNumber numberWithInteger:root1] atIndexedSubscript:root2];
    }
}

@end
Was it helpful?

Solution

For the find operation, there is no need to store the path (separately from your _array) or to use recursion. Either of those approaches requires O(P) storage (P = path length). Instead, you can just traverse the path twice. The first time, you find the root. The second time, you set all of the children to point to the root. This takes O(P) time and O(1) storage.

- (NSInteger)findItem:(NSInteger)item {
    NSInteger root;
    NSNumber *rootObject = nil;
    for (NSInteger i = item; !rootObject; ) {
        NSInteger parent = [_array[i] integerValue];
        if (parent < 0) {
            root = i;
            rootObject = @(i);
        }
        i = parent;
    }

    for (NSInteger i = item; i != root; ) {
        NSInteger parent = [_array[i] integerValue];
        _array[i] = rootObject;
        i = parent;
    }

    return root;
}

For the merge operation, you want to store each root's rank (which is an upper bound on its depth), not each root's descendant count. Storing each root's rank allows you to merge the shorter tree into the taller tree, which guarantees O(log N) time for find operations. The rank only increases when the trees to be merged have equal rank.

- (void)joinItem:(NSInteger)a item:(NSInteger)b {
    NSInteger aRank = -[_array[a] integerValue];
    NSInteger bRank = -[_array[b] integerValue];
    if (aRank < bRank) {
        NSInteger t = a;
        a = b;
        b = t;
    } else if (aRank == bRank) {
        _array[a] = @(-aRank - 1);
    }

    _array[b] = @(a);
}

OTHER TIPS

You definitely should implement path compression using recursion. I would not even think about trying to do it non-recursively.

Implementing the disjoin-set datastructure should be very easy, and can be done in few lines. Its very, very easy to translate it from the pseudocode to any programming language. You can find the pseudocode on Wikipedia. (Unfortunately, I can't read Objective-C, so I cannot really judge wether your code is correct or not).

Yes. To implement highest ancestor compression without recursion you need to maintain your own list. Make one pass up the chain to get pointers to the sets that need their parent pointers changed and also to learn the root. Then make a second pass to update the necessary parent pointers.

The recursive method is doing the same thing. The first pass is the "winding up" of the recursion, which stores the sets needing parent pointer updates on the program stack. The second pass is in reverse as the recursion unwinds.

I differ with those who say the recursive method is always best. In a reasonable number systems (especially embedded ones), the program stack is of limited size. There are cases where many unions are performed in a row before a find. In such cases, the parent chain can be O(n) in size for n elements. Here collapsing by recursion can blow out the stack. Since you are working in Objective C, this may be iOS. I do not know the default stack size there, but if you use recursion it's worth looking at. It might be smaller than you think. This article implies 512K for secondary threads and 1Mb for the main thread.

Iterative, constant space alternative

Actually the main reason I'm writing is to point out that you still get O(log^* n) for n ammortized operations -- just a shade less efficient than collapsing, and still effectively O(1) -- if you only do factor-of-two compression: in the find operation, change parent pointers so that they point to the grandparents instead instead of the root. This can be done with iteration in constant storage. This lecture at Princeton talks about this algorithm and implements it in a loop with 5 lines of C. See slide 29.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top