Cypher multiple OPTIONAL MATCH - Pattern Comprehension - COUNT DISTINCT

Question

I have read a lot of comments about OPTIONAL MATCH and Pattern Comprehesion, but I can't find a solution for my case.

I have a node (Account) in my Neo4j Database and I'd like to count the nodes which belongs to each account. The following code works with one or two optional matches, but the many optional matches produce a cross product and a timeout.

// Account
MATCH (a:Account{billingCountry: "DE", isDeleted: false})
WHERE a.id IS NOT NULL

// User
MATCH (a)<-[:CREATED]-(u:User)

// Contact
OPTIONAL MATCH (a) <-[:CONTACT_OF]- (c:Contact{isDeleted: false})

// Opportunity
OPTIONAL MATCH (a) <-[:OPPORTUNITY_OF]- (o:Opportunity{isDeleted: false, s4sMarked_For_Deletion__C: false})

// Open Opportunity
OPTIONAL MATCH (a)<-[:OPPORTUNITY_OF]-(open:Opportunity{isClosed: false, isDeleted: false})

// Attribute
OPTIONAL MATCH (a) <-[:ATTRIBUTE_OF]- (aa:Attribute_Assignment{isDeleted: false})

// Sales Planning
OPTIONAL MATCH (a) <-[:SALESPLAN_OF]- (s:Sales_Planning)

// Task
OPTIONAL MATCH (a) <-[:TASK_OF]- (t:Task{isDeleted: false})

// Event
OPTIONAL MATCH (a) <-[:EVENT_OF]- (e:Event{isDeleted: false})

// Contract
OPTIONAL MATCH (a) <-[:CONTRACT_OF]- (ct:Contract{isDeleted: false})

RETURN
a.id

u.name AS User_Name,
u.department AS User_Department,

COUNT(DISTINCT c.id) AS Contact_Count,
COUNT(DISTINCT o.id) AS Opportunity_Count,
COUNT(DISTINCT open.id) AS OpenOpp_Count,
COUNT(DISTINCT aa.id) AS Attribute_Count,
COUNT(DISTINCT s.timeYear) AS Sales_Plan_Count,
COUNT(DISTINCT t.id) AS Task_Count,
COUNT(DISTINCT e.id) AS Event_Count,
COUNT(DISTINCT ct.id) AS Contract_Count

I can rewrite the query with a Pattern Compression, but then I just get back the non distinct ids in arrays. Is there a way to count the distinct values inside the arrays or another way how to count the values in pattern compression?

MATCH (a:Account{billingCountry: "DE", isDeleted: false})
WHERE a.id IS NOT NULL

RETURN a.id,
[
[(a)<-[:CONTACT_OF]- (c:Contact{isDeleted: false}) | c.id],
[(a)<-[:OPPORTUNITY_OF]- (o:Opportunity{isDeleted: false, s4sMarked_For_Deletion__C: false}) | o.id],
[(a)<-[:OPPORTUNITY_OF]-(open:Opportunity{isClosed: false, isDeleted: false}) | open.id],
[(a) <-[:ATTRIBUTE_OF]- (aa:Attribute_Assignment{isDeleted: false}) | aa.id],
[(a) <-[:SALESPLAN_OF]- (s:Sales_Planning) | s.timeYear],
[(a) <-[:TASK_OF]- (t:Task{isDeleted: false}) | t.id],
[(a) <-[:EVENT_OF]- (e:Event{isDeleted: false}) | e.id],
[(a) <-[:CONTRACT_OF]- (ct:Contract{isDeleted: false}) | ct.id]
]

If I made a formal mistake in my first stockoverflow post, I would appreciate feedback :)

Charchit Kapoor · Accepted Answer

The problem lies, in the RETURN statement, because you are calculating all the counts at the last, neo4j has to calculate the cartesian products. If you calculate each node count at each step, it will be much more optimal. Like this:

MATCH (a:Account{billingCountry: "DE", isDeleted: false})
WHERE a.id IS NOT NULL
MATCH (a)<-[:CREATED]-(u:User)
OPTIONAL MATCH (a) <-[:CONTACT_OF]- (c:Contact{isDeleted: false})
WITH a, u, COUNT(DISTINCT c.id) AS Contact_Count,
OPTIONAL MATCH (a) <-[:OPPORTUNITY_OF]- (o:Opportunity{isDeleted: false, s4sMarked_For_Deletion__C: false})
WITH a, u, Contact_Count, COUNT(DISTINCT o.id) AS Opportunity_Count
OPTIONAL MATCH (a)<-[:OPPORTUNITY_OF]-(open:Opportunity{isClosed: false, isDeleted: false})
WITH a, u, Contact_Count, Opportunity_Count, COUNT(DISTINCT open.id) AS OpenOpp_Count
OPTIONAL MATCH (a) <-[:ATTRIBUTE_OF]- (aa:Attribute_Assignment{isDeleted: false})
WITH a, u, Contact_Count, Opportunity_Count, OpenOpp_Count, COUNT(DISTINCT aa.id) AS Attribute_Count
OPTIONAL MATCH (a) <-[:SALESPLAN_OF]- (s:Sales_Planning)
WITH a, u, Contact_Count, Opportunity_Count, OpenOpp_Count, Attribute_Count,COUNT(DISTINCT s.timeYear) AS Sales_Plan_Count
OPTIONAL MATCH (a) <-[:TASK_OF]- (t:Task{isDeleted: false})
WITH a, u, Contact_Count, Opportunity_Count, OpenOpp_Count, Attribute_Count, Sales_Plan_Count, COUNT(DISTINCT t.id) AS Task_Count
OPTIONAL MATCH (a) <-[:EVENT_OF]- (e:Event{isDeleted: false})
WITH a, u, Contact_Count, Opportunity_Count, OpenOpp_Count, Attribute_Count, Sales_Plan_Count, Task_Count, COUNT(DISTINCT e.id) AS Event_Count
OPTIONAL MATCH (a) <-[:CONTRACT_OF]- (ct:Contract{isDeleted: false})
RETURN
a.id, u.name AS User_Name, u.department AS User_Department, Contact_Count,
Opportunity_Count, OpenOpp_Count, Attribute_Count, Sales_Plan_Count,
Task_Count, Event_Count, COUNT(DISTINCT ct.id) AS Contract_Count

Cypher multiple OPTIONAL MATCH - Pattern Comprehension - COUNT DISTINCT

Tags:

list-comprehension

neo4j

cypher

Axel

1 Answers

Charchit Kapoor

Recent Activity

Donate For Us

Cypher multiple OPTIONAL MATCH - Pattern Comprehension - COUNT DISTINCT

Tags:

list-comprehension

neo4j

cypher

Axel

1 Answers

Charchit Kapoor

Related questions

Recent Activity

Donate For Us