Extract subgraph in neo4j

Question 1

I solved it by constructing the induced subgraph based on all traversal endpoints.

Building the subgraph from the set of last nodes and edges of every traversal does not work, because edges that are not part of any shortest paths would not be included.

The code snippet looks like this:

Set<Node> nodes = new HashSet<Node>();
Set<Relationship> edges = new HashSet<Relationship>();

for (Node n : traverser.nodes())
{
    nodes.add(n);
}

for (Node node : nodes)
{
    for (Relationship rel : node.getRelationships())
    {
        if (nodes.contains(rel.getOtherNode(node)))
            edges.add(rel);
    }
}

Every edge is added twice. One time for the outgoing node and one time for the incoming node. Using a Set, I can ensure that it's in the collection only once.

It is possible to iterate over incoming/outgoing edges only, but it is unclear how loops (edge from a node to itself) are handled. To which category do they belong to? This snippet does not have this issue.

Question 2

As for graph matching, that has been superseded by http://docs.neo4j.org/chunked/snapshot/cypher-query-lang.html which would fit nicely, and supports fuzzy matchin with optional relationships.

For subgraph representation, I would use the Cypher output to maybe construct new Cypher statements for recreating the graph, much like a SQL export, something like

start n=node:node_auto_index(name='Neo') 
match n-[r:KNOWS*]-m 
return "create ({name:'"+m.name+"'});"

http://console.neo4j.org/r/pqf1rp for an example

Question 3

See dumping the database to cypher statements

dump START n=node({self}) MATCH p=(n)-[r:KNOWS*]->(m) RETURN n,r,m;

There's also an example for importing the subgraph of first database (db1) into a second (db2).