Question

I have an application that creates many thousands of graphs in memory per second. I wish to find a way to persist these for subsequent querying. They aren't particularly large (perhaps max ~1k nodes).

I need to be able to store the entire graph object including node attributes and edge attributes. I then need to be able to search for graphs within specific time windows based on a time attribute in a node.

Is there a simple way to coerce this data into Neo4j? I have yet to find any examples of this, though I have found several Python libs, including an embedded Neo4j and a REST client.

Is the common approach to manually traverse the graph and store it in that manner?

Are there any better persistence alternatives?

Solution

NetworkX has several serialization methods.

In your case, I would choose GraphML serialization:

http://networkx.github.io/documentation/latest/reference/readwrite.graphml.html

It's quite simple to use (note that write_graphml takes the graph as its first argument):

import networkx as nx
nx.write_graphml(G, '/path/to/file')
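As a fuller sketch of that snippet, here is a round trip that stores node and edge attributes and reads them back; the attribute names ("time", "weight") and file path are just illustrative, matching the time attribute mentioned in the question:

```python
import networkx as nx

# Build a small graph with node and edge attributes.
G = nx.Graph()
G.add_node(1, time=1398291600)   # e.g. a Unix timestamp
G.add_node(2, time=1398291660)
G.add_edge(1, 2, weight=0.5)

# write_graphml needs the graph as its first argument.
nx.write_graphml(G, "/tmp/graph.graphml")

# Read it back; note that GraphML node ids come back as strings.
H = nx.read_graphml("/tmp/graph.graphml")
print(H.nodes(data=True))
```

GraphML stores typed attribute keys, so integer and float values survive the round trip; only the node ids are stringified on the way back.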

To load it into Neo4j (provided you are on a version before Neo4j 2.0), you can use TinkerPop Gremlin to load your GraphML dump:

g.loadGraphML('/path/to/file')

The TinkerPop stack is quite useful, and not only for serialization/deserialization.

It will allow you to use different graph databases with a common "dialect" (provided they have a "Blueprints" driver, which most of them do).
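Since the question also asks about searching graphs by a time window, one simple approach on the Python side is to keep each graph as its own GraphML file and filter on load. This is only a sketch: the directory layout, attribute name "time", and the helper name are assumptions, not part of any library API:

```python
import glob
import os

import networkx as nx


def graphs_in_window(directory, start, end, attr="time"):
    """Yield (path, graph) pairs for every GraphML file in `directory`
    that has at least one node whose `attr` value falls in [start, end).

    All names here are illustrative; adapt to your own layout.
    """
    for path in sorted(glob.glob(os.path.join(directory, "*.graphml"))):
        g = nx.read_graphml(path)
        if any(start <= data.get(attr, float("-inf")) < end
               for _, data in g.nodes(data=True)):
            yield path, g
```

This scans every file, so it won't scale to "thousands of graphs per second" on its own; for that you would want the time attribute indexed in whatever store you choose, which is exactly what loading into Neo4j (with an index on the time property) buys you.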

OTHER TIPS

networkx supports flexible container structures (e.g. arbitrary combinations of Python lists and dicts) as attributes on both nodes and edges.

Are there restrictions on the Neo4j side for persisting such flexible data?
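There are restrictions on both ends: Neo4j properties must be primitives or arrays of primitives, and networkx's write_graphml likewise rejects non-scalar attribute values. One common workaround is to encode nested containers as JSON strings before serializing. A minimal sketch, assuming the attribute names and paths shown are placeholders:

```python
import json

import networkx as nx

G = nx.Graph()
# A nested container attribute: fine inside networkx...
G.add_node("a", meta={"tags": ["x", "y"], "score": 3})

# ...but write_graphml (and Neo4j properties) only accept scalar
# values, so encode nested structures as JSON strings first.
H = G.copy()
for _, data in H.nodes(data=True):
    for key, value in list(data.items()):
        if isinstance(value, (dict, list)):
            data[key] = json.dumps(value)

nx.write_graphml(H, "/tmp/flat.graphml")

# Decode after reading back.
R = nx.read_graphml("/tmp/flat.graphml")
meta = json.loads(R.nodes["a"]["meta"])
```

The cost of this trick is that the nested values become opaque strings: you can no longer index or query inside them from Neo4j, only retrieve and decode them application-side.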

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow