Question

I have the following dataset:

firm_id firm_id_
1         2
1         4
1         5
2         1
2         3
3         2
3         6
4         1
4         5
4         6
5         4
5         7
6         3
...

This data says, for example, that firm_id = 1 is directly connected to firm_id = 2, 4, and 5 and indirectly connected (within two edges) to firm_id = 3, 6, and 7. I can use a Python package like networkx to build the network of firm connectivity. Now, I want to use Spectral Clustering (I guess this is the correct methodology) to form clusters based on distance (the number of edges separating each pair of firms) and see how these clusters are connected to each other.
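
To make the notion of distance concrete, here is a minimal sketch that counts the edges separating firm 1 from every other firm using a breadth-first search. It uses only the rows shown above (the rows hidden behind "..." are not included) and assumes the connections are undirected; the helper names build_adjacency and hop_distances are made up for this illustration.

```python
from collections import deque

# Edge list copied from the rows shown in the question.
# Rows hidden behind "..." are deliberately left out.
edges = [(1, 2), (1, 4), (1, 5), (2, 1), (2, 3), (3, 2), (3, 6),
         (4, 1), (4, 5), (4, 6), (5, 4), (5, 7), (6, 3)]


def build_adjacency(edge_list):
    """Return {firm: set of neighbours}, treating every row as undirected."""
    adjacency = {}
    for a, b in edge_list:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def hop_distances(adjacency, source):
    """Breadth-first search: number of edges from source to each reachable firm."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


adjacency = build_adjacency(edges)
distances_from_1 = hop_distances(adjacency, 1)

# Firms grouped by how many edges separate them from firm 1.
direct = sorted(f for f, d in distances_from_1.items() if d == 1)
two_hops = sorted(f for f, d in distances_from_1.items() if d == 2)
print("direct:", direct)
print("two hops:", two_hops)
```

With networkx, nx.shortest_path_length(G, source=1) should give the same dictionary for an undirected graph built from these rows.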

I would first define an adjacency matrix W from the above data. I would then apply a transformation (given in the original post as an image, which is not reproduced here) to each element of W, where dist is the distance between firm i and firm j and c is a scale parameter, and then compute the Laplacian matrix (see here for example).

Now my question is: can Spectral Clustering give me the links between clusters and tell me how far apart the clusters are (how many edges separate them)? I was thinking of using the scikit-learn package in Python, but I have no idea how to generate the links between clusters using sklearn.cluster.

Solution

Community detection on the network is what I needed:

http://perso.crans.org/aynaud/communities/
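
As a rough sketch of how community detection can answer the original question, the snippet below finds communities and then derives the links and the edge distance between them. It swaps in networkx's built-in greedy_modularity_communities rather than the package linked above, so the resulting partition may differ from what that package returns. The edge list is limited to the rows shown in the question, and the names membership, cross_links, and community_distance are made up for this example.

```python
import itertools

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Rows shown in the question; rows hidden behind "..." are not included.
edges = [(1, 2), (1, 4), (1, 5), (2, 1), (2, 3), (3, 2), (3, 6),
         (4, 1), (4, 5), (4, 6), (5, 4), (5, 7), (6, 3)]

# An undirected Graph collapses duplicate rows such as (1, 2) and (2, 1).
G = nx.Graph()
G.add_edges_from(edges)

# Community detection on the graph structure alone (no similarity matrix).
communities = [set(c) for c in greedy_modularity_communities(G)]

# Map each firm to the index of its community.
membership = {firm: i for i, c in enumerate(communities) for firm in c}

# Links between communities: number of edges crossing each pair.
cross_links = {}
for u, v in G.edges():
    cu, cv = membership[u], membership[v]
    if cu != cv:
        key = (min(cu, cv), max(cu, cv))
        cross_links[key] = cross_links.get(key, 0) + 1

# Distance between communities: the fewest edges separating any firm in
# one community from any firm in the other (infinite if unreachable).
all_dist = dict(nx.all_pairs_shortest_path_length(G))
community_distance = {}
for i, j in itertools.combinations(range(len(communities)), 2):
    community_distance[(i, j)] = min(
        all_dist[u].get(v, float("inf"))
        for u in communities[i]
        for v in communities[j]
    )

for i, c in enumerate(communities):
    print("community", i, sorted(c))
print("edges between communities:", cross_links)
print("edge distance between communities:", community_distance)
```

Taking the minimum over member pairs is only one possible definition of how far apart two clusters are; an average over pairs would be an equally reasonable choice, depending on what the clusters are meant to capture.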

Other tips

For spectral clustering and similar approaches to work well, you need a measure of similarity between instances.

Your data seems to be solely a graph, i.e. edges that connect instances. You should be looking at graph partitioning and maybe community detection algorithms that work solely on the graph structure, and do not assume you have a continuous measure of similarity.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow