
Get nodes that don't have certain relationship (cypher/neo4j)

I have the following two node types:

c:City {name: 'blah'}
s:Course {title: 'whatever', city: 'New York'}

Looking to create this:

(s)-[:offered_in]->(c)

I'm trying to find all courses that are NOT yet tied to a city and create the relationship to the matching city (the city gets created if it doesn't exist). However, my dataset is about 5 million nodes, and any query I run times out unless I process it in increments of 10k.

Does anybody have any advice?

EDIT:

Here is the query for jobs that I'm running now. It has to be run in 10k chunks (out of millions) because even a single chunk takes a few minutes. It creates the city if it doesn't exist:

match (j:Job)
where not has(j.merged) and has(j.city)
WITH j 
LIMIT 10000
MERGE (c:City {name: j.city})
WITH j, c
MERGE (j)-[:in]->(c)
SET j.merged = 1
return count(j)

(For now I don't know of a good way to filter out the ones already matched, so I'm tagging each processed node with a custom "merged" attribute, which I already have an index on.)

Asked Sep 04 '14 17:09 by Diaspar


1 Answer

500000 is a fair few nodes, and in your other question you suggested that 90% of them lack the relationship you want to create here, so this is going to take a bit of time. Without knowing more about your system (spec, Neo4j setup, programming environment) and when you are running this (on old data or on insert), this is just a best guess at a tidier solution:

// Relationship types are case-sensitive: use :in here if that is what your existing data uses.
MATCH (j:Job)
WHERE NOT (j)-[:IN]->() AND HAS(j.city)
MERGE (c:City {name: j.city})
MERGE (j)-[:IN]->(c)
RETURN count(j)

Obviously you can add your limits back as required.
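
For what it's worth, here is a rough sketch of the same query with the limit added back. The batch size of 10000 is taken from your question and the index statement is my own assumption about what might help, not something I have measured on your data. It uses the older `CREATE INDEX ON` and `HAS()` syntax to match the queries above; on newer Neo4j versions you would probably write `j.city IS NOT NULL` instead of `HAS(j.city)`.

```cypher
// Assumption: run once, as a separate statement, so MERGE can look up
// City nodes by name through an index instead of scanning every :City node.
CREATE INDEX ON :City(name);

// One batch: pick up to 10000 jobs that have a city property but no
// outgoing :IN relationship yet, then create the city (if missing) and
// the relationship. Relationship types are case-sensitive, so use :in
// instead if that is what earlier runs created.
MATCH (j:Job)
WHERE NOT (j)-[:IN]->() AND HAS(j.city)
WITH j LIMIT 10000
MERGE (c:City {name: j.city})
MERGE (j)-[:IN]->(c)
RETURN count(j);
```

Re-run the second statement until it returns 0. Because the WHERE clause skips jobs that already have the relationship, each run picks up where the previous one stopped, so the extra "merged" flag should not be needed.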

Answered Oct 04 '22 18:10 by JohnMark13