Question

I have data in the below form, which makes up a bipartite network.

A1 - B1
A2 - B2
A2 - B1
A3 - B1
A4 - B2
A5 - B3
A6 - B3
A7 - B3
A7 - B3
A8 - B4
A9 - B3

What I would like to do is write something (ideally in python or C) or use an existing library to identify individual communities within the data. For instance

A1,A2,A3,A4 are all part of the same community because they connect to B1,B2 similarly A5,A6,A7,A8,A9 all connected to B3 and B4.

I am a bit confused having read lots various articles about network flow and graphs as to exactly where my problem sits. Is this just a form of Breadth-first search or is there a more efficient means of doing this?

Thanks

Was it helpful?

Solution

@Eli has a good idea to find the connected components. Since you know the labels (in this case anyway) start with "A" you can do it like this:

import networkx as nx
edges = """A1 - B1
A2 - B2
A2 - B1
A3 - B1
A4 - B2
A5 - B3
A6 - B3
A7 - B3
A7 - B3
A8 - B4
A9 - B3""".split('\n')
G = nx.parse_edgelist(edges,delimiter=' - ')
for component in nx.connected_components(G):
    print [n for n in component if n.startswith('A')]

OTHER TIPS

Using Python and the igraph library, you may do the following:

import igraph
graph = igraph.Graph.Formula("A1-B1, A2-B2, A2-B1, A3-B1, A4-B2, A5-B3, A6-B3, A7-B3, A8-B4, A9-B3")
comms = graph.clusters()
for comm in comms:
    print ", ".join(graph.vs[comm]["name"])

A short explanation: Graph.Formula constructs a graph out of a string representation like the one above, but you can use any other method provided by igraph to construct your graph. An advantage of using Graph.Formula is that it automatically creates a name vertex attribute containing the vertex names. graph.clusters() searches for the connected components of the network and returns a VertexClustering object. This object can be used in a for loop to iterate over the components. In the core of the for loop, the comm variable will always contain the indices of the nodes in the current community. I select the vertices of the community using graph.vs[comm], request their names as a list (graph.vs[comm]["name"]) and then join the names by commas.

If you want to use Python, read about the NetworkX library. It has lots of modules and algorithm implementations for graphs. In particular, you may find the Bipartite module useful. I'm not sure what you mean by "communities", but the bipartite_color function from that module may help you.

Maybe something like:

import collections

data = ( ("A1", "B1"), ("A2", "B2"), ("A2", "B1") )
out = collections.defaultdict(list)

for value, key in data:
  out[key].append(value)

print out
-> defaultdict(<type 'list'>, {'B1': ['A1', 'A2'], 'B2': ['A2']})

This only works one-way though. You could of course make 2 dicts, one with the A set as key and one with the B set as key. It assumes that the keys are immutable (strings, numbers).

No! Take care to use the NetworkX library because that doesn't have more than 4 function for bipartite graphs. one for verify if it is bipartite, one for coloring the nodes, one for create a simple bipartite networks without weights and another for create a projection of the bipartite networks You can may be use the last one function.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top