Question

I'm using py2neo 1.6.4 and Neo4j 2.0.1, and I'm finding some oddities when accessing indexed nodes. In particular, a node looked up through the index is not the same object as that node looked up by id.

For example:

>>> graph_db.get_or_create_indexed_node('index','key',1)
Node('http://localhost:7474/db/data/node/1')
>>> graph_db.get_indexed_node('index','key',1)
Node('http://localhost:7474/db/data/node/1')   #get works fine after create
>>> graph_db.get_indexed_node('index','key',1).exists
True                                 #the node exists in the db
>>> graph_db.get_indexed_node('index','key',1)._id
1                                    #the id for the node
>>> graph_db.node(1)
Node('http://localhost:7474/db/node/ #note that this is different than the query on the index
>>> graph_db.node(1).exists
False                                #node does not exist in db when accessed by id

So the node returned when accessed by id does not actually exist in the database, even though the id is exactly the one assigned to the indexed node.

I am fairly new to both Neo4j and py2neo and do not have a sophisticated understanding of indexing, so an answer that helps educate me and others would be fantastic. If this is a bug, that would be good to know as well :)

Thanks!


Solution

I'm not completely familiar with how py2neo determines if a node exists in the database or not, but you may want to try to use the new indexes introduced in Neo4j 2.0.0. The indexes you are using here are the legacy indexes which require you to manually keep them up to date, and there are several caveats around their operation. The new indexes are automatically kept up to date and work more as an optimization for your queries, in the same manner indexes work in relational databases.

I'm not sure how, or whether, py2neo exposes these indexes directly, but you can access them through py2neo's Cypher API. It's generally a better idea to use the Cypher query language when working against the Neo4j server, since it lets you send larger chunks of domain work to be done in the database, rather than pulling data out one HTTP call at a time and doing the work on the client side.

For instance:

from py2neo import cypher

session = cypher.Session("http://localhost:7474")
tx = session.create_transaction()

# Create an index
tx.append("CREATE INDEX ON :User(name)")
tx.commit()

# Query that will use the index for lookup
tx = session.create_transaction()
tx.append("MATCH (n:User) WHERE n.name='Cat Stevens' RETURN n")
# execute() sends the pending statement but leaves the transaction open
results = tx.execute()
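
If you end up building queries from user input, it is probably worth passing values as Cypher parameters instead of splicing them into the query string. As a rough sketch (not py2neo's own API), this is the JSON body that Neo4j 2.0's transactional HTTP endpoint accepts. The helper name build_cypher_payload is made up for illustration, the :User label and name property are just the ones from the example above, and nothing is actually sent to the server here.

```python
import json

# Neo4j 2.0 transactional endpoint that begins and commits in a single request.
COMMIT_PATH = "/db/data/transaction/commit"


def build_cypher_payload(statements):
    """Build the JSON body for the transactional Cypher endpoint.

    `statements` is a list of (cypher_text, parameters_dict) pairs.
    This helper is hypothetical; it only assembles the request body.
    """
    return {
        "statements": [
            {"statement": text, "parameters": params}
            for text, params in statements
        ]
    }


# {name} is the Neo4j 2.0 placeholder syntax for a query parameter.
payload = build_cypher_payload([
    ("MATCH (n:User) WHERE n.name = {name} RETURN n", {"name": "Cat Stevens"}),
])

# Serialized body, ready to be POSTed to http://localhost:7474 + COMMIT_PATH.
body = json.dumps(payload)
```

Several statements can go into the same list, which is what lets you push a larger chunk of work to the database in one round trip.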
Licensed under: CC-BY-SA with attribution