Question

I am trying to cluster data using the implementation of the Markov Clustering (mcl) algorithm at micans.org . I read in a description of the algorithm that it was possible to assign one element to several clusters. How can I do that? So far, I can get the clusters with unique assignments for points,

Was it helpful?

Solution

It is possible. This has never been a well-researched feature however, and one issue is that there are at the moment several choices available, foremost the mcl iterand to work with. One way to approach it is as follows:

mcl MCL-GRAPH-FILE -o foobar -dump dag -dump-interval 1:6 -wself 0.4 -wmax 0.4
clm imac -imx dag-1.foobar -overlap keep -o imac-1.foobar
clm imac -imx dag-2.foobar -overlap keep -o imac-2.foobar
clm imac -imx dag-3.foobar -overlap keep -o imac-3.foobar
clm imac -imx dag-4.foobar -overlap keep -o imac-4.foobar
clm imac -imx dag-5.foobar -overlap keep -o imac-5.foobar

For a (small) graph with 150 nodes this reports the following (besides saving the results):

[clmmate] kept <7> instances of overlap
[clmmate] kept <47> instances of overlap
[clmmate] kept <37> instances of overlap
[clmmate] kept <19> instances of overlap
[clmmate] kept <6> instances of overlap

This shows that the overlap associated with mcl iterands increases, peaks and then decreases again. It is best to work with an 'mcl graph file' and a separate file keeping track of the labels. Look for example at http://micans.org/mcl/man/clmprotocols.html#internal. The -wself and -wmax parameters instruct mcl how to reduce the iterand to a sparser graph. This is the graph that is dumped (with a 'dag' prefix). In the example the parameters are very loose and the resulting graph may in fact not be a DAG (directed acyclic graph). With the default parameters it should be a DAG.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top