Kruskal's algorithm and disjoint-set data structure: Do I need the following two lines of code?

https://stackoverflow.com/questions/5442275

12-11-2019
|

문제

I've implemented Kruskal's algorithm in C++ using the disjoint-set data structure according to Wikipedia like this:

#include <stdio.h>
#include <algorithm>
#define MAX_EDGES 10000000
#define MAX_VERTICES 200001
using namespace std;
int num_edges,num_vertices;
int total_cost=0;

struct edge{
    int v1,v2;
    int cost;
};

struct comp{
    bool operator()(const edge& e1,const edge& e2){
        return e1.cost<e2.cost;
    }
};

edge edges[MAX_EDGES];
int parent[MAX_VERTICES];
int rank[MAX_VERTICES];

int findset(int x){
    if(x!=parent[x]){
        parent[x]=findset(parent[x]);
    }
    return parent[x];
}

void merge(int x,int y){
    int px=findset(x),py=findset(y);
    if(rank[px]>rank[py]){
        parent[py]=px;
    }else{
        parent[px]=py;
    }
    if(rank[px]==rank[py]){
        ++rank[py];
    }
}

int main(){
    FILE* in=fopen("input","r");
    FILE* out=fopen("output","w");
    fscanf(in,"%d %d\n",&num_vertices,&num_edges);
    for(int i=1;i<=num_vertices;++i){
        parent[i]=i;
        rank[i]=0;
    }
    for(int i=0;i<num_edges;++i){
        fscanf(in,"%d %d %d\n",&edges[i].v1,&edges[i].v2,&edges[i].cost);
    }
    sort(edges,edges+num_edges,comp());
    for(int i=0;i<num_edges;++i){
        int s1=findset(edges[i].v1),s2=findset(edges[i].v2);
        if(s1!=s2){
            merge(s1,s2);
            total_cost+=edges[i].cost;
        }
    }
    fprintf(out,"%d\n",total_cost);
}

My question is: Do I need these two lines of code? If so, what's their importance?

int px=findset(x),py=findset(y); in merge instead of int px=parent[x],py=parent[y];
parent[x]=findset(parent[x]); in findset instead of return findset(parent[x]);

해결책

1) findset(x) returns the canonical representative of the set that x is in (the root of its ancestry tree). You need this to be able to compare whether two elements are in the same set or not (they have the same representative), parent[x] just returns the parent of x in the tree, which may not be the root.

1a) You forgot to test for px and py being identical in merge.

2) It's an optimization so that future calls to findset will run faster. If parent[x] used to point to its parent which pointed to the root of its set's tree, after this call parent[x] will point directly to the root.

다른 팁

You need this to because x.parent is not necessarily the representative of the class to which x belongs, so the algorithm wouldn't be correct without it.
Without the assignment, the algorithm would be suboptimal. This is the path compression optimization, also discussed in the Wikipedia.

라이센스 : CC-BY-SA ~와 함께 속성

제휴하지 않습니다 StackOverflow