Question

I have some graph databases (friends networks, purchasing history, etc.) that I persist with Neo4j. I plan to analyze these with community detection algorithms such as Girvan Newman. These algorithms usually return a dendrogram, representing the division of the graph from whole network to individual nodes. I am wondering how I might persist these results. I suppose it could be stored as a separate graph, but is there a way to store it within the graph itself? My concern in doing so is the need for creating nodes to represent the groups, which is something I would like to avoid.

Was it helpful?

Solution

One way to represent a dendrogram is as a list of pairs, containing (n-1) pairs for n elements. Assuming the left element of the pair is the one whose ID is kept to refer to all elements in a community, a sample dendrogram might look like

[[0,1],[2,3],[0,2]]

So an alternative way to persist that might be to store at each node at which time step it is merged into another node (together with all the nodes that have been previously merged into it).

So you'd attach (0:0) to 1, (1:2) to 3 and (2:0) to 2 (timestep:new 'name' of node).

edit: Concretely, this might mean attaching two integer-valued attributes e.g. 'merge_timestep' and 'merge_into' to each Neo4J node object.

OTHER TIPS

Most community detection algorithms work by agglomerating communities along existing edges in the graph; Girvan-Newman is a little unusual in that it works by cutting edges. Either way, the dendrogram can be viewed as showing an ordering of operations on the edges of the graph. Thus, instead of storing the dendrogram as a separate object, you can attach properties to the edges (relationships) showing in which order they should be merged/cut. My knowledge of Neo4j is extremely limited, so I'll leave the details to you.

There are some complications with merging, as there will generally be multiple equivalent edges, each linking different vertices within the communities to merge. Basically, just pick a strategy that lets you figure out the linked communities from the edges.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top