Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Neo4j Shortest Path for specific node labels in-between the path

I'm looking for a hint to solve the following issue:

Given a Neo4j DB with two node types applied as labels

:Address and :Milestone (one for each node)

Between the different nodes there are directed :LEADS_TO relationships.

Given the following pattern (I skipped the [ ] for the relations):

Pattern 1:

(a:Address)->(:Milestone)->(:Milestone)<-(:Milestone)<-(:Milestone)<-(b:Address)

Pattern 2:

(a:Address)->(:Milestone)->(c:Address)<-(:Milestone)<-(b:Address)

I'm looking for a query that only includes "in-between" nodes of the same type within the shortestPath analysis. In my case it should only include nodes with the label :Milestone and skips shorter paths that have a node labeled :Address somewhere in-between.

In other words: the query should match Pattern 1 and not Pattern 2 even if it is shorter.

I found the following solution which comes close but fails in my case because it covers the start and end node label as well and due to this selects no rows at all:

MATCH p=shortestPath( (a:Address)-[:LEADS_TO*..10]-(b:Address) ) 
WHERE a.name = 'XYZ'
AND b.name = 'ABC'
AND ALL(x IN nodes(p) WHERE (x:MILESTONE))
RETURN p

My initial idea was to create a first hop:

For the Start node

MATCH (a:Address)-[:LEADS_TO]->(aa:Milestone)

and similar for the end node and start the query from there.

But this is problematic because every :Address node can have multiple :LEADS_TO relationships. Querying the shortest path for a defined start and end node this approach would create n start nodes and m end nodes just for one query that could be executed with a distinct start and end node.

Does anyone have an idea how to exclude the start and end node from the WHERE clause within Cypher?

Any hint is greatly appreciated.

like image 464
Krid Avatar asked Jun 16 '16 17:06

Krid


2 Answers

Try this, you can use a subset of a collection with a slice operator

MATCH p=shortestPath( (a:Address)-[:LEADS_TO*..10]-(b:Address) ) 
WHERE a.name = 'XYZ'
AND b.name = 'ABC'
AND ALL(x IN nodes(p)[1..-1] WHERE (x:MILESTONE))
RETURN p
like image 91
Michael Hunger Avatar answered Jan 22 '23 18:01

Michael Hunger


You can use a WHERE ALL clause based on the relationship type and the nodes not being start or end, for eg:

MATCH p=shortestPath( (a:Address)-[:LEADS_TO*..10]-(b:Address) ) 
WHERE a.name = 'XYZ'
AND b.name = 'ABC'
AND ALL( x IN range( 1, size(nodes(p)-2) ) 
         WHERE 'Milestone' IN labels(nodes(p)[x]) )
like image 28
Christophe Willemsen Avatar answered Jan 22 '23 18:01

Christophe Willemsen