Question

I am attempting to generate a database using MERGE statements through Neo4JPHP. All of my queries are using MERGE; however, it is generating separate nodes every time, resulting in massive duplication.

The queries are run within a single transaction. I've removed the surrounding code to focus on the queries:

$transaction = $client->beginTransaction();

while(...) {
    $pageQuery = new Query($client, 'MERGE (n:Page {url:"'.$page.'"}) SET n.title="'.$title.'"');
    $transaction->addStatements(array($pageQuery));

    $h1Query = new Query($client, 'MATCH (n:Page {url:"'.$page.'"}) SET n.h1s = "['.implode(", ", $h1s).']"');
    $transaction->addStatements(array($h1Query));

    $scriptQuery = new Query($client, 'MATCH (n:Page {url:"'.$page.'"}) MERGE (n)-[:CONTAINS_SCRIPT]->(s:Script {url:"'.$s.'"})');
    $transaction->addStatements(array($scriptQuery));

    $styleQuery = new Query($client, 'MATCH (n:Page {url:"'.$page.'"}) MERGE (n)-[:CONTAINS_STYLESHEET]->(s:StyleSheet {url:"'.$s.'"})');
    $transaction->addStatements(array($styleQuery));

    $otherPageQuery = new Query($client, 'MATCH (n:Page {url:"'.$page.'"}) MERGE (n)-[:LINKS_TO]->(m:Page {url:"'.$match.'"})');
    $transaction->addStatements(array($otherPageQuery));
}

$transaction->commit();

After running this across a couple of pages, I end up with 6 copies of the same Page node: one with the title and h1s properties set, and the rest without.

I also tried using CREATE UNIQUE, but this gave an error that the syntax wasn't supported.

I am running Neo4j 2.0.1. Any suggestions?


Solution

When you use MERGE on a pattern that includes relationships, Cypher matches or creates the entire pattern as a unit. If no match for the whole pattern is found, the whole pattern is created.

For example:

MERGE (n:Page { url: "http://www.neo4j.org" })
RETURN n

This gets or creates the Page node whose url property is set to http://www.neo4j.org. Run repeatedly, this statement will not create a duplicate node.

Now let's assume that this node now exists within the Neo4j database and then we run the following query:

MERGE (n:Page { url: "http://www.neo4j.org" })-[:CONNECTED_TO]->(test:Test { id: "test" })
RETURN *

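This attempts to match the entire pattern. If the full pattern does not exist, Cypher creates the entire path, including a new Page node, even though a Page with that url already exists. Only nodes already bound earlier in the query (for example by a preceding MATCH) are reused; every unbound node in the pattern is created anew, which is why the LINKS_TO query in the question keeps producing extra Page nodes.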
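To confirm that this is what happened in your database, a query along the following lines should list every url that is shared by more than one Page node. It is only a diagnostic sketch; the aliases url and copies are names chosen here for illustration.

```cypher
// Group Page nodes by url and keep only the urls that occur more than once.
MATCH (p:Page)
WITH p.url AS url, count(*) AS copies
WHERE copies > 1
RETURN url, copies
ORDER BY copies DESC
```

If the explanation above applies, the urls returned should be the ones that appear as link targets in the LINKS_TO query.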

To resolve your issue, make sure that you use MERGE to get or create your individual nodes first. Then you can use MERGE to get or create the relationship between the two nodes.

Example:

MERGE (n:Page { url: "http://www.neo4j.org" })
MERGE (s:StyleSheet { url: "http://www.neo4j.org/styles/main.css" })
MERGE (n)-[:CONTAINS_STYLESHEET]->(s)
RETURN *
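
Applied to the LINKS_TO query from the question, the same approach might look like the sketch below. It assumes the two URLs are passed as query parameters, here given the hypothetical names page and match, using the curly-brace parameter syntax of Neo4j 2.x, rather than being concatenated into the query string.

```cypher
// Get or create each Page on its own, so an existing node is always reused.
MERGE (n:Page { url: {page} })
MERGE (m:Page { url: {match} })
// Both nodes are now bound, so this MERGE can only create the relationship.
MERGE (n)-[:LINKS_TO]->(m)
```

The CONTAINS_SCRIPT and CONTAINS_STYLESHEET queries can be rewritten the same way, with the Script or StyleSheet node merged on its own line before the relationship.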
Licensed under: CC-BY-SA with attribution