Question

I am wondering whether can I use networkx to extract all possible induced subgraphs (graphlets) with specific number of nodes in the subgraphs from an input large graph, or is there another package that can do the job? For example, if I have a large graph, which is illustrated in networkx adjacency list format,

graph G:

1 2 3 7
2 1 4
3 1 4 6 5
4 2 3 5
5 3 4 6
6 3 5 7
7 1 6

which will be look like

enter image description here

if I want to extract graphlet with 3 nodes the algorithm should return me

subgraph1:

1 2 3
2 1
3 1

[(1,2),(1,3)] enter image description here subgraph2:

1 3 7
3 1
7 1

[(1,3),(1,7)] enter image description here subgraph3:

3 4 5
4 3 5
5 3 4

[(3,4),(3,5),(4,5)] enter image description here

subgraph4,subgraph5,subgraph6...

The following is the code of the question suggested by @Hooked. Let's say n=3

import itertools
target = nx.complete_graph(3)
for sub_nodes in itertools.combinations(g.nodes(),len(target.nodes())):
    subg = g.subgraph(sub_nodes)
    if nx.is_connected(subg):
        print subg.edges()

the the output will look like

[(1, 2), (1, 3)]
[(1, 2), (2, 4)]
[(1, 2), (1, 7)]
[(1, 3), (3, 4)]
[(1, 3), (3, 5)]
[(1, 3), (3, 6)]
[(1, 3), (1, 7)]
[(1, 7), (6, 7)]
[(2, 4), (3, 4)]
[(2, 4), (4, 5)]
[(3, 4), (3, 5), (4, 5)]
[(3, 4), (3, 6)]
[(3, 5), (3, 6), (5, 6)]
[(3, 6), (6, 7)]
[(4, 5), (5, 6)]
[(5, 6), (6, 7)]
Was it helpful?

Solution

This assumes you want all matching subgraphs of a given target which you'll have to define. The native way is to loop over all combinations of nodes, find those connected then check for an isomorphism. It's unclear if you want a network motif or a graphlet. In a graphlet all edges present in the original graph must be there - this would exclude 3-4-5 from your target. This method finds graphlets, to find motifs you'll have to check for each combination if there is an induced subgraph (and how many!).

import networkx as nx

g = nx.Graph()
g.add_edge(1,2);g.add_edge(1,3)
g.add_edge(1,7);g.add_edge(2,4)
g.add_edge(3,4);g.add_edge(3,5)
g.add_edge(3,6);g.add_edge(4,5)
g.add_edge(5,6);g.add_edge(6,7)

import itertools

target = nx.Graph()
target.add_edge(1,2)
target.add_edge(2,3)

for sub_nodes in itertools.combinations(g.nodes(),len(target.nodes())):
    subg = g.subgraph(sub_nodes)
    if nx.is_connected(subg) and nx.is_isomorphic(subg, target):
        print subg.edges()

For me, this gives the edge set matches of:

[(1, 2), (1, 3)]
[(1, 2), (2, 4)]
[(1, 2), (1, 7)]
[(1, 3), (3, 4)]
[(1, 3), (3, 5)]
[(1, 3), (3, 6)]
[(1, 3), (1, 7)]
[(1, 7), (6, 7)]
[(2, 4), (3, 4)]
[(2, 4), (4, 5)]
[(3, 4), (3, 6)]
[(3, 6), (6, 7)]
[(4, 5), (5, 6)]
[(5, 6), (6, 7)]

Your examples are listed in here.

OTHER TIPS

For people who ended up here having the same problem but have too many nodes, here are few simple improvements on @Hooked's answer (although I am sure there are better solutions out there as @Hooked mentioned in comments, this is just a quick copy-paste fix for people who ended up here with the same reason as I did and had scaling problems)

1) igraph scales way better than networkx

2) we can only take a neighborhood of a node to eliminate most of the unnecessary combinations

For example if we are looking for a motif in larger network (both igraph objects)

    motif_rank = max(max(motif.shortest_paths_dijkstra()))
    result = collections.OrderedDict.fromkeys(network.vs['label'], 0)

    for node in self.network.vs:
        # Get relevant nodes around node of interest that might create the motif of interest
        nodes_to_expand = {node}
        for rank in range(motif_rank):
            nodes_expanded = nodes_to_expand
            for node_to_expand in nodes_to_expand:
                nodes_expanded = set.union(nodes_expanded, set(node_to_expand.neighbors()))
            nodes_to_expand = nodes_expanded

        # Look at all combinations
        for sub_nodes in itertools.combinations(nodes_to_expand, motif.vcount()):
            subg = network.subgraph(sub_nodes)
            if subg.is_connected() and subg.isomorphic(motif):
                result[node['label']] = result[node['label']]+1
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top