Question

I'm learning Neo4j, and my toy project is to play with Twitter data. In this little script I'm using the Python libraries tweepy and py2neo to take one twitter_user and insert all of their friends.

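from py2neo import neo4j
from tweepy import Cursor

# `api` (an authenticated tweepy API object) and `graph_db` (a py2neo graph
# database handle) are assumed to be set up elsewhere.
def insert_friends(twitter_user):
    for friend in Cursor(api.friends, user_id=twitter_user.id_str).items():
        # One query per friend: find both nodes by id_str, then link them.
        neo4j.CypherQuery(graph_db, """
                MATCH (user),(friend)
                WHERE user.id_str={user_id_str} AND friend.id_str={friend_id_str}
                CREATE UNIQUE (user)-[:FOLLOWS]->(friend)
        """).execute_one(user_id_str=twitter_user.id_str, friend_id_str=friend.id_str)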

This works fine, but I suspect it can be optimized. Namely, in the WHERE clause, I'm looking up the same user node each time. How do I avoid that extra lookup? For instance, is there any way I could figure out a priori which node it is in Neo4j and just specify the Neo4j internal node ID?


Solution

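You need to use labels and indexes! Without a label, MATCH (user),(friend) has to scan every node in the graph to evaluate the WHERE clause. With a label and an index on id_str, each lookup becomes an index seek instead. This assumes your nodes were created with the :User label.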

Namely:

CREATE INDEX ON :User(id_str);

MATCH (user:User),(friend:User) // add labels so it knows to use the index
WHERE user.id_str={user_id_str} AND friend.id_str={friend_id_str}
CREATE UNIQUE (user)-[:FOLLOWS]->(friend);
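
Plugged back into the loop from the question, the labelled query might look like the sketch below. It is only an illustration under assumptions: `run_query` is a hypothetical callable (not part of py2neo or tweepy) that stands in for whatever executes Cypher in your setup, and the function takes plain id strings rather than tweepy objects so that it can be tried without a database or Twitter credentials.

```python
# Labelled version of the query: the :User labels let Neo4j use the
# index on :User(id_str) instead of scanning every node.
LABELLED_QUERY = """
MATCH (user:User), (friend:User)
WHERE user.id_str = {user_id_str} AND friend.id_str = {friend_id_str}
CREATE UNIQUE (user)-[:FOLLOWS]->(friend)
"""


def insert_friends(run_query, user_id_str, friend_id_strs):
    """Send one labelled query per friend and return how many were sent.

    `run_query` is a hypothetical callable taking the query text plus
    keyword parameters; it is an assumption of this sketch.
    """
    sent = 0
    for friend_id_str in friend_id_strs:
        run_query(
            LABELLED_QUERY,
            user_id_str=user_id_str,
            friend_id_str=friend_id_str,
        )
        sent += 1
    return sent
```

With the py2neo calls used in the question, `run_query` could be something like `lambda q, **p: neo4j.CypherQuery(graph_db, q).execute_one(**p)`, and `friend_id_strs` could be a generator over `friend.id_str` from the tweepy Cursor. The user node is still matched once per query, but with the index in place that match is a cheap lookup rather than a full scan.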
Licensed under: CC-BY-SA with attribution