Question

I've been writing a Cypher query to count all nodes with a label, property, or relationship that contains the criteria typed in by the user. When I run this query, I want a count of the nodes per label name (which I've accomplished). I also want a count of the nodes that have specific relationships.

This is the part of the query that returns a count of all the nodes with the labels: Fruit, Lesson, Tech, and TestData. I left out the WHERE clause because it's pretty long.

match (n)
return 
sum(CASE when any(l IN labels(n) WHERE l='Fruit') THEN 1 ELSE 0 END) AS Fruit,      
sum(CASE when any(l IN labels(n) WHERE l='Lesson') THEN 1 ELSE 0 END) AS Lesson,
sum(CASE when any(l IN labels(n) WHERE l='Tech') THEN 1 ELSE 0 END) AS Tech,
sum(CASE when any(l IN labels(n) WHERE l='TestData') THEN 1 ELSE 0 END) AS TestData

It returns

Fruit    Lesson    Tech    TestData
1000     20        100     50

However, I'd also like to count the number of nodes that have specific relationships (I know the names ahead of time) like "KNOWS", "IS_A", and "DESTINATION". For instance, if a user searches for the word "knows" and the resulting nodes have a relationship called "knows", then I would count each such node. My query would then report that it had found 20 nodes connected via the "knows" relationship.

I'd like to do this without excluding any of my result nodes. Notice I didn't include any relationships in the match clause. I still want to include nodes that don't have a relationship (I don't care about counting that).

Does anyone know how to do this? Can it be done?

I'm looking for something similar to this:

match (n)
return
sum(CASE when any(r IN rels(n) WHERE r='KNOWS') THEN 1 ELSE 0 END) AS KNOWS,      
sum(CASE when any(r IN rels(n) WHERE r='IS_A') THEN 1 ELSE 0 END) AS IS_A,
sum(CASE when any(r IN rels(n) WHERE r='DESTINATION') THEN 1 ELSE 0 END) AS DESTINATION

Solution

I'm not sure I follow what you mean by "count the nodes that matched specific relationships" and "without excluding any of my result nodes". Do you want to count each node that has at least one relationship of a given type, in either direction (so a node with two or five such relationships still counts once), rather than counting the relationships themselves? Or do you want a sub-query that works alongside the earlier label-counting query? Does this do what you want?

MATCH (n)
OPTIONAL MATCH (n)-[r]-()
WITH n, collect(r) as rs
RETURN
SUM(CASE WHEN ANY(l IN labels(n) WHERE l='Fruit') THEN 1 ELSE 0 END) AS Fruit,
SUM(CASE WHEN ANY(l IN labels(n) WHERE l='Lesson') THEN 1 ELSE 0 END) AS Lesson,
SUM(CASE WHEN ANY(l IN labels(n) WHERE l='Tech') THEN 1 ELSE 0 END) AS Tech,
SUM(CASE WHEN ANY(l IN labels(n) WHERE l='TestData') THEN 1 ELSE 0 END) AS TestData,
SUM(CASE WHEN ANY(r IN rs WHERE type(r)='KNOWS') THEN 1 ELSE 0 END) AS KNOWS,
SUM(CASE WHEN ANY(r IN rs WHERE type(r)='IS_A') THEN 1 ELSE 0 END) AS IS_A,
SUM(CASE WHEN ANY(r IN rs WHERE type(r)='DESTINATION') THEN 1 ELSE 0 END) AS DESTINATION
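
To make the two readings of the question concrete, here is a minimal sketch, written as two separate queries. The aliases nodesWithKnows and knowsRelationships are illustrative names only, not something from the original post.

```cypher
// Reading 1: nodes that take part in at least one KNOWS relationship.
// DISTINCT means a node with several KNOWS relationships is counted once.
MATCH (n)-[:KNOWS]-()
RETURN count(DISTINCT n) AS nodesWithKnows;

// Reading 2: the KNOWS relationships themselves.
// The directed pattern yields each relationship once.
MATCH ()-[r:KNOWS]->()
RETURN count(r) AS knowsRelationships;
```

The combined query above follows the first reading: each node contributes at most 1 to a column, however many relationships of that type it has.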

Something you could try, which may be an improvement, is to match each thing you want to count explicitly, count it, and carry the count forward with WITH until you return. This helps most when your database contains many things you are not counting, since the query above matches and evaluates them anyway. So you could experiment with something like

MATCH (n:Fruit)
WITH count(n) as Fruit
MATCH (n:Lesson)
WITH Fruit, count(n) as Lesson
MATCH (n:Tech)
WITH Fruit, Lesson, count(n) as Tech
MATCH (n:TestData)
WITH Fruit, Lesson, Tech, count(n) as TestData
// DISTINCT so that a node with several relationships of a type is counted once
MATCH (n)-[:KNOWS]-()
WITH Fruit, Lesson, Tech, TestData, count(DISTINCT n) as KNOWS
MATCH (n)-[:IS_A]-()
WITH Fruit, Lesson, Tech, TestData, KNOWS, count(DISTINCT n) as IS_A
MATCH (n)-[:DESTINATION]-()
RETURN Fruit, Lesson, Tech, TestData, KNOWS, IS_A, count(DISTINCT n) as DESTINATION
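
One caveat with chaining counts this way: if a later MATCH finds nothing, no rows reach the aggregation that follows it, so the whole query may come back empty rather than report a zero. A hedged sketch of one way around that, shortened to two labels and one relationship type, is to use OPTIONAL MATCH for every step after the first. It is worth checking the behaviour on your own Neo4j version.

```cypher
MATCH (n:Fruit)
WITH count(n) AS Fruit
// OPTIONAL MATCH keeps the row (with n = null) when nothing matches,
// and count() ignores nulls, so the result is 0 rather than no row.
OPTIONAL MATCH (n:Lesson)
WITH Fruit, count(n) AS Lesson
OPTIONAL MATCH (n)-[:KNOWS]-()
// DISTINCT counts each node once, however many KNOWS relationships it has.
RETURN Fruit, Lesson, count(DISTINCT n) AS KNOWS
```

The first MATCH needs no such guard, because an aggregation without grouping keys still returns a single row when there is no input.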

I don't know how much of a difference it will make on your data, but where possible it is better to use narrow, specific patterns than to match a broad pattern and then filter. It may be worth profiling the two queries and comparing the results.
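
As a rough sketch of how that comparison might look, you can prefix each query with PROFILE and compare the reported database hits and rows for each plan. The two queries below are cut down to the KNOWS count only, as an illustration, and are meant to be run one at a time.

```cypher
// Broad pattern: visit every node, collect its relationships, then filter.
PROFILE
MATCH (n)
OPTIONAL MATCH (n)-[r]-()
WITH n, collect(r) AS rs
RETURN sum(CASE WHEN any(r IN rs WHERE type(r) = 'KNOWS') THEN 1 ELSE 0 END) AS KNOWS;

// Narrow pattern: match only KNOWS relationships in the first place.
PROFILE
MATCH (n)-[:KNOWS]-()
RETURN count(DISTINCT n) AS KNOWS;
```

Both should return the same number, so any difference in the plans reflects the cost of the broad match rather than a difference in the result.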

Licensed under: CC-BY-SA with attribution