Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cypher multiple OPTIONAL MATCH - Pattern Comprehension - COUNT DISTINCT

I have read a lot of comments about OPTIONAL MATCH and Pattern Comprehesion, but I can't find a solution for my case.

I have a node (Account) in my Neo4j Database and I'd like to count the nodes which belongs to each account. The following code works with one or two optional matches, but the many optional matches produce a cross product and a timeout.

// Account
MATCH (a:Account{billingCountry: "DE", isDeleted: false})
WHERE a.id IS NOT NULL

// User
MATCH (a)<-[:CREATED]-(u:User)

// Contact
OPTIONAL MATCH (a) <-[:CONTACT_OF]- (c:Contact{isDeleted: false})

// Opportunity
OPTIONAL MATCH (a) <-[:OPPORTUNITY_OF]- (o:Opportunity{isDeleted: false, s4sMarked_For_Deletion__C: false})

// Open Opportunity
OPTIONAL MATCH (a)<-[:OPPORTUNITY_OF]-(open:Opportunity{isClosed: false, isDeleted: false})

// Attribute
OPTIONAL MATCH (a) <-[:ATTRIBUTE_OF]- (aa:Attribute_Assignment{isDeleted: false})

// Sales Planning
OPTIONAL MATCH (a) <-[:SALESPLAN_OF]- (s:Sales_Planning)

// Task
OPTIONAL MATCH (a) <-[:TASK_OF]- (t:Task{isDeleted: false})

// Event
OPTIONAL MATCH (a) <-[:EVENT_OF]- (e:Event{isDeleted: false})

// Contract
OPTIONAL MATCH (a) <-[:CONTRACT_OF]- (ct:Contract{isDeleted: false})

RETURN
a.id

u.name AS User_Name,
u.department AS User_Department,

COUNT(DISTINCT c.id) AS Contact_Count,
COUNT(DISTINCT o.id) AS Opportunity_Count,
COUNT(DISTINCT open.id) AS OpenOpp_Count,
COUNT(DISTINCT aa.id) AS Attribute_Count,
COUNT(DISTINCT s.timeYear) AS Sales_Plan_Count,
COUNT(DISTINCT t.id) AS Task_Count,
COUNT(DISTINCT e.id) AS Event_Count,
COUNT(DISTINCT ct.id) AS Contract_Count

I can rewrite the query with a Pattern Compression, but then I just get back the non distinct ids in arrays. Is there a way to count the distinct values inside the arrays or another way how to count the values in pattern compression?

MATCH (a:Account{billingCountry: "DE", isDeleted: false})
WHERE a.id IS NOT NULL

RETURN a.id,
[
[(a)<-[:CONTACT_OF]- (c:Contact{isDeleted: false}) | c.id],
[(a)<-[:OPPORTUNITY_OF]- (o:Opportunity{isDeleted: false, s4sMarked_For_Deletion__C: false}) | o.id],
[(a)<-[:OPPORTUNITY_OF]-(open:Opportunity{isClosed: false, isDeleted: false}) | open.id],
[(a) <-[:ATTRIBUTE_OF]- (aa:Attribute_Assignment{isDeleted: false}) | aa.id],
[(a) <-[:SALESPLAN_OF]- (s:Sales_Planning) | s.timeYear],
[(a) <-[:TASK_OF]- (t:Task{isDeleted: false}) | t.id],
[(a) <-[:EVENT_OF]- (e:Event{isDeleted: false}) | e.id],
[(a) <-[:CONTRACT_OF]- (ct:Contract{isDeleted: false}) | ct.id]
]

If I made a formal mistake in my first stockoverflow post, I would appreciate feedback :)

like image 994
Axel Avatar asked Mar 11 '26 16:03

Axel


1 Answers

The problem lies, in the RETURN statement, because you are calculating all the counts at the last, neo4j has to calculate the cartesian products. If you calculate each node count at each step, it will be much more optimal. Like this:

MATCH (a:Account{billingCountry: "DE", isDeleted: false})
WHERE a.id IS NOT NULL
MATCH (a)<-[:CREATED]-(u:User)
OPTIONAL MATCH (a) <-[:CONTACT_OF]- (c:Contact{isDeleted: false})
WITH a, u, COUNT(DISTINCT c.id) AS Contact_Count,
OPTIONAL MATCH (a) <-[:OPPORTUNITY_OF]- (o:Opportunity{isDeleted: false, s4sMarked_For_Deletion__C: false})
WITH a, u, Contact_Count, COUNT(DISTINCT o.id) AS Opportunity_Count
OPTIONAL MATCH (a)<-[:OPPORTUNITY_OF]-(open:Opportunity{isClosed: false, isDeleted: false})
WITH a, u, Contact_Count, Opportunity_Count, COUNT(DISTINCT open.id) AS OpenOpp_Count
OPTIONAL MATCH (a) <-[:ATTRIBUTE_OF]- (aa:Attribute_Assignment{isDeleted: false})
WITH a, u, Contact_Count, Opportunity_Count, OpenOpp_Count, COUNT(DISTINCT aa.id) AS Attribute_Count
OPTIONAL MATCH (a) <-[:SALESPLAN_OF]- (s:Sales_Planning)
WITH a, u, Contact_Count, Opportunity_Count, OpenOpp_Count, Attribute_Count,COUNT(DISTINCT s.timeYear) AS Sales_Plan_Count
OPTIONAL MATCH (a) <-[:TASK_OF]- (t:Task{isDeleted: false})
WITH a, u, Contact_Count, Opportunity_Count, OpenOpp_Count, Attribute_Count, Sales_Plan_Count, COUNT(DISTINCT t.id) AS Task_Count
OPTIONAL MATCH (a) <-[:EVENT_OF]- (e:Event{isDeleted: false})
WITH a, u, Contact_Count, Opportunity_Count, OpenOpp_Count, Attribute_Count, Sales_Plan_Count, Task_Count, COUNT(DISTINCT e.id) AS Event_Count
OPTIONAL MATCH (a) <-[:CONTRACT_OF]- (ct:Contract{isDeleted: false})
RETURN
a.id, u.name AS User_Name, u.department AS User_Department, Contact_Count,
Opportunity_Count, OpenOpp_Count, Attribute_Count, Sales_Plan_Count,
Task_Count, Event_Count, COUNT(DISTINCT ct.id) AS Contract_Count
like image 82
Charchit Kapoor Avatar answered Mar 14 '26 10:03

Charchit Kapoor



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!