Subgraph matching (JUNG)

https://stackoverflow.com/questions/9846094

26-05-2021
|

Question

I have a set of subgraphs and I need to match them on the graph they were extracted from. I also need to count how many times each subgraph shows up in such graph (I need to store all possible matches). There must be a perfect match considering the edges' labels of both subgraph and graph, the vertices' labels, however, don´t need to match each other. I built my system using JUNG API, so I would like a solution (api, algorithm etc) that could deal with the Graph structure provided by JUNG. Any thoughts?

La solution 2

Well, I had to solve the problem by implementing it from scratch. I followed the strategy suggested in the topic Any working example of VF2 algorithm?. So, if someone is in doubt about this problem too, I suggest to take a look at Rich Apodaca's answer in the aforementioned topic.

Autres conseils

JUNG is very full-featured, so if there isn't a graph analysis algorithm in JUNG for what you need, there's usually a strong, theoretical reason for it. To me, your problem sounds like an instance of the Subgraph Isomorphism problem, which is NP-Complete:

http://en.wikipedia.org/wiki/Subgraph_isomorphism_problem

NP-Completeness may or may not be familiar to you (it took me 7 years of college and Master's Degree in Computer Science to understand it!), so I'll give a high-level description here. Certain problems, like sorting, can be solved in Polynomial time with respect to their input size. For example, if I have a list of N elements, I can sort it in O(N log(N)) time. More specifically, if I can solve a problem in Polynomial time, this means I can solve the problem without exhausting every possible solution. In the sorting case, I could traverse every possible permutation of the list and, if I found a permutation of the list that was sorted, return it. This is obviously not the fastest way to solve the problem though! Some very clever mathematicians were able to get it down to its theoretical minimum of O(N log(N)), thus we can sort really big lists of things quite quickly using computers today.

On the flip-side, NP-Complete problems are thought to have no Polynomial time solution (I say thought because no one has ever proven it, although evidence strongly suggests this is the case). Anyway, what this means is that you cannot definitively solve an NP-Complete problem without first exhausting every possible solution. The time complexity of NP-Complete problems are always O(c ^ N) or worse, where c is some constant greater than 1. This means that the time required to solve the problem grows exponentially with every incremental increase in problem size.

So what does this have to do with my problem???

What I'm getting at here is that, if the Subgraph Isomorphism problem is NP-Complete, then the only way you can determine if one graph is a subgraph of another graph is by exhausting every possible solution. So you can solve this, but probably only up to graphs of a few nodes or so (since the problem's time complexity grows exponentially with every incremental increase in graph size). This means that it is computationally infeasible to compute a solution for your problem because as soon as you reach a certain graph size, it will quite literally take forever to find a solution.

More practically, if your boss asks you to do something that is provably NP-Complete, you can simply say it's impossible and he will have to listen to you. If your professor asks you to do something that is provably NP-Complete, show him that it's NP-Complete and you'll probably get an A for the course. If YOU are trying to do something NP-Complete of your own accord, it's better to just move on to the next project... ;)

Licencié sous: CC-BY-SA avec attribution

Non affilié à StackOverflow