What parameters can I play with using mcl?


There are mainly two things to consider. The first and most important lies outside mcl (http://micans.org/mcl/) itself: how the network is constructed. I've written about it elsewhere, but I'll repeat it here because it is important.

If you have a weighted similarity, choose an edge-weight (similarity) cutoff such that the topology of the network becomes informative; with too many or too few edges, the absence/presence structure of edges carries little discriminative information. Choose the cutoff such that no edges connect things you consider very dissimilar, and such that edges connect things you consider somewhat similar to quite similar. In the case of mcl, the dynamic range in edge weight between 'a bit similar' and 'very similar' should be, as a rule of thumb, up to about one order of magnitude, e.g. two-fold, five-fold or ten-fold, as opposed to varying only from 0.9 to 1.0. Of course, it is possible to give simple (unweighted) networks to mcl, and it will just utilise the absence/presence of edges. Make sure the network does not become very dense - a very rough rule of thumb could be to aim for a total number of edges on the order of V * sqrt(V) if the number of nodes (vertices) is V, that is, each node has, on average, on the order of sqrt(V) neighbours.
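As a rough illustration of the rules of thumb above, here is a minimal Python sketch that keeps only the strongest edges up to a budget of about V * sqrt(V) and reports the resulting cutoff and dynamic range. The function names (edge_budget, choose_cutoff, dynamic_range) and the (node, node, weight) tuple layout are assumptions made for this example, not part of mcl, and the cutoff it suggests is only a starting point: your knowledge of the data should still decide what counts as 'somewhat similar'.

```python
import math


def edge_budget(num_nodes):
    """Rough target for the total edge count: V * sqrt(V)."""
    return int(num_nodes * math.sqrt(num_nodes))


def choose_cutoff(edges):
    """Keep the strongest edges, at most about V * sqrt(V) of them.

    edges is a list of (node_a, node_b, weight) tuples (hypothetical layout).
    V is counted before filtering, so nodes that lose all their edges
    still contribute to the budget.
    Returns (cutoff, kept_edges); cutoff is None when there are no edges.
    """
    nodes = {n for a, b, _ in edges for n in (a, b)}
    budget = edge_budget(len(nodes))
    ranked = sorted(edges, key=lambda e: e[2], reverse=True)
    kept = ranked[:budget]
    cutoff = kept[-1][2] if kept else None
    return cutoff, kept


def dynamic_range(edges):
    """Ratio of the largest to the smallest weight (weights must be > 0)."""
    weights = [w for _, _, w in edges]
    if not weights:
        raise ValueError("no edges given")
    return max(weights) / min(weights)
```

If dynamic_range on the kept edges comes out close to 1 (for example weights between 0.9 and 1.0), that suggests rescaling or transforming the similarity rather than just moving the cutoff.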

Network construction, as described above, is really crucial, and it is advisable to try different approaches. Now, given a network, there is really only one mcl parameter to vary: the inflation parameter (the -I option). A good set of values to test is 1.4, 2, 3, 4, 6.
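The inflation sweep suggested above could be scripted along these lines. This is a sketch under assumptions: it assumes the network is stored in mcl's label ("ABC") format, i.e. one "nodeA nodeB weight" line per edge, read with the --abc option; the output file naming scheme and the function names build_commands and run_sweep are made up for this example.

```python
import shutil
import subprocess

# Inflation values suggested in the text above.
INFLATIONS = (1.4, 2, 3, 4, 6)


def build_commands(input_path, inflations=INFLATIONS):
    """Build one mcl command line per inflation value.

    Assumes label (ABC) input, hence --abc. The output name
    '<input>.I<value>.clusters' is a hypothetical convention.
    """
    commands = []
    for inflation in inflations:
        output_path = f"{input_path}.I{inflation}.clusters"
        commands.append(
            ["mcl", input_path, "--abc", "-I", str(inflation), "-o", output_path]
        )
    return commands


def run_sweep(input_path, inflations=INFLATIONS):
    """Run mcl once per inflation value; requires mcl on the PATH."""
    if shutil.which("mcl") is None:
        raise RuntimeError("mcl executable not found on PATH")
    for command in build_commands(input_path, inflations):
        subprocess.run(command, check=True)
```

Calling run_sweep("graph.abc") would then leave one clustering file per inflation value to compare. Higher inflation values generally give finer-grained clusterings, so the sweep runs from coarse to fine.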

In summary, if you are exploring, try different ways of network construction, using your knowledge of the data to make the network a meaningful representation, and combine this with trying different mcl inflation values.