Question

I use py2neo (v 1.9.2) to write data to a neo4j db.

from py2neo import neo4j

# graph_db is an existing neo4j.GraphDatabaseService instance
batch = neo4j.WriteBatch(graph_db)
current_relationship_index = graph_db.get_or_create_index(neo4j.Relationship, "Current_Relationship")
touched_relationship_index = graph_db.get_or_create_index(neo4j.Relationship, "Touched_Relationship")
# This lookup runs immediately, outside the batch
get_rel = current_relationship_index.get(some_key1, some_value1)
if len(get_rel) == 1:
    # Relationship already indexed: only mark it as touched
    batch.add_indexed_relationship(touched_relationship_index, some_key2, some_value2, get_rel[0])
elif len(get_rel) == 0:
    # Relationship not indexed yet: create it (also outside the batch), then mark it as touched
    created_rel = current_relationship_index.create(some_key3, some_value3, (my_start_node, "KNOWS", my_end_node))
    batch.add_indexed_relationship(touched_relationship_index, some_key4, "touched", created_rel)
batch.submit()

Is there a way to replace current_relationship_index.get(...) and current_relationship_index.create(...) with batch commands? I know that batch equivalents exist, but the problem is that I need to act on the return values of these calls. For performance reasons, I would like to have all statements in a single batch.

I have read that indexing relationships is rather uncommon, but my reason is the following: I need to parse a (text) file every day and check whether any relations have changed compared with the previous day. If a relation no longer exists in the text file, I want to mark it with a "replaced" property in neo4j. Therefore, I add all "touched" relationships to the appropriate index, so I know that these did not change. All relations that are not in the touched_relationship_index no longer exist in the file, so I can mark them.
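The marking logic described above boils down to a set difference. Here is a minimal sketch in plain Python, with no py2neo involved; the function name find_replaced and the string keys are hypothetical stand-ins for the entries of the two indexes.

```python
def find_replaced(current_keys, touched_keys):
    """Return the keys that are indexed as current but were not touched today."""
    return set(current_keys) - set(touched_keys)


# Hypothetical keys standing in for the entries of the two relationship indexes
current = {"alice-KNOWS-bob", "bob-KNOWS-carol", "carol-KNOWS-dave"}
touched_today = {"alice-KNOWS-bob", "carol-KNOWS-dave"}

# Everything left over would get the "replaced" property
replaced = find_replaced(current, touched_today)
print(sorted(replaced))
```

Whether the two sets live in neo4j indexes or elsewhere, the relationships to mark are exactly those left over after subtracting the touched ones.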

I can't think of an easier way to do so, even though I'm sure that py2neo offers one.

EDIT: Considering Nigel's comment I tried this:

my_rel = batch.get_or_create_indexed_relationship(current_relationship_index, some_key, some_value, my_start_node, my_type, my_end_node)
batch.add_indexed_relationship(touched_relationship_index, some_key2, some_value2, my_rel)
batch.submit()

This obviously does not work, because I can't refer to "my_rel" within the batch. How can I solve this? Should I refer with "0" to the result of the previous batch statement? The whole thing is supposed to run in a loop, though, so the numbers are not fixed. Maybe I could use a variable "batch_counter" that refers to the current batch statement and is incremented whenever a statement is added to the batch?
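To make the counter idea concrete, here is a rough sketch in plain Python. CountingBatch is a hypothetical stand-in, not a py2neo class, and whether py2neo 1.9.2 accepts such a position as a reference to an earlier statement is an assumption that is not confirmed here. The sketch only shows that the position can be tracked inside a loop without hard-coding any numbers.

```python
class CountingBatch:
    """Hypothetical stand-in for a write batch that records statements in order."""

    def __init__(self):
        self.jobs = []

    def append(self, description, refers_to=None):
        # The position of a statement is the number of statements added before it
        job_id = len(self.jobs)
        self.jobs.append({"id": job_id, "job": description, "refers_to": refers_to})
        return job_id


batch = CountingBatch()
for key in ["k1", "k2", "k3"]:
    # Remember the position of the get-or-create statement ...
    rel_id = batch.append("get_or_create relationship for " + key)
    # ... and refer to it from the statement that follows
    batch.append("add to touched index for " + key, refers_to=rel_id)

for job in batch.jobs:
    print(job)
```

Because append returns the position itself, no separate "batch_counter" variable has to be kept in sync by hand.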

Solution

Have a look at WriteBatch.get_or_create_indexed_relationship. It conditionally creates a relationship depending on whether one already exists, and it operates atomically. Documentation link below:

http://book.py2neo.org/en/latest/batches/#py2neo.neo4j.WriteBatch.get_or_create_indexed_relationship

There are a few similar uniqueness management facilities in py2neo that I recently blogged about here that you might want to read about.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow