Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cypher: how to find all the chains of single nodes not repeated?

I want to find ALL the paths starting and ending from/to a specific node. I would like that each node in a path appears only once.

For example, in a graph like this:

(a)-[:REL]->(b)-[:REL]->(c)-[:REL]->(a)
(a)-[:REL]->(e)-[:REL]->(f)-[:REL]->(a)
(e)-[:REL]->(b)

grafically:

  e → b
 ↙ ↖ ↗ ↘
f → a ← c

Cypher code:

CREATE (a { name:'A' })-[:REL]->(b {name:'B'})-[:REL]->(c { name:'C' })
       -[:REL]->(a)-[:REL]->(e {name:'E'})-[:REL]->(f {name:'F'})-[:REL]->(a),
       (e)-[:REL]->(b)

I would like that the research of chains starting from (a) returns

(a)->(b)->(c)->(a)

(a)->(e)->(f)->(a)

(a)->(e)->(b)->(c)->(a)

while starting from (f) returns only

(f)->(a)->(e)->(f)

and NOT

(f)->(a)->(b)->(c)->(a)->(e)->(f)

because it passes twice through the node (a).

I tryied:

MATCH p=(a {name:'F'})-[:REL*1..]->(a) 
WHERE SINGLE(e1 IN TAIL(NODES(p)) WHERE SINGLE(e2 IN TAIL(NODES(p)) WHERE e1=e2))
RETURN p

but I've got no result. The best I've been able to reach is to have no repetition of just the start node with this query:

MATCH p=(a {name:'F'})-[:REL*1..]->(a) 
WHERE SINGLE(e IN TAIL(NODES(p)) WHERE e=a)
RETURN p

but obviously it's not what I want because it returns also

(f)->(a)->(b)->(c)->(a)->(e)->(f)

that is a path involving the node (a) twice.

Can someone suggest me a solution?

Thank you in advance.

P.S. The version of Neo4j that I use is 2.0

like image 930
mdalp Avatar asked Sep 30 '22 17:09

mdalp


2 Answers

On the shoulders of those who came before I have come up with:

MATCH (a { name:'A' })
MATCH (a)-[p:REL]->(s)
WITH a, s
MATCH p = s-[r:REL*]->a
WITH [a] + nodes(p) AS ns
    WHERE ALL (n IN ns 
       WHERE 1=LENGTH(FILTER(m IN TAIL(ns) 
                             WHERE m = n)))
RETURN ns

It utilises the idea from @FrobberOfBits to perform the initial steps and then the query linked from Michael to perform the filter on the paths. I suppose that the TAIL could be left out, if you also left out:

WITH [a] + nodes(p) AS ns    

Which would make the query this maybe:

MATCH (a { name:'A' })
MATCH (a)-[p:REL]->(s)
WITH a, s
MATCH p = s-[r:REL*]->a
WITH a, nodes(p) AS ns
WHERE ALL (n IN ns 
       WHERE 1=LENGTH(FILTER(m IN ns 
                             WHERE m = n)))
RETURN [a] + ns

The complicated or at least slightly hard to read part of the query is the ALL in combination with the FILTER which "basically" for all values in ns (WHERE ALL (n in ns) counts the number of occurences by comparing to all other values in the array (m IN ns WHERE m = n) and filters the result out completely if the resultant array does not contain a single element (1=LENGTH).

I added your example in the console here.

like image 91
JohnMark13 Avatar answered Oct 03 '22 18:10

JohnMark13


So here is some data mocked up to copy your example:

$ create (a:T {label:"A"})-[:REL]->(b:T)-[:REL]->(c:T)-[:REL]->(a); 
+-------------------+
| No data returned. |
+-------------------+
Nodes created: 3
Relationships created: 3
Properties set: 1
Labels added: 3
20 ms
neo4j-sh (?)$ match (a:T {label:"A"})                                          
> CREATE a-[:REL]->(e:T)-[:REL]->(f:T)-[:REL]->(a);
+-------------------+
| No data returned. |
+-------------------+
Nodes created: 2
Relationships created: 3
Labels added: 2
27 ms

Now here's the query that will get you there (but it requires some explanation).

match (a:T {label: "A"})-[:REL]->(somethingElse),
      p=shortestPath((somethingElse)-[:REL*]->(a))
return p;

OK so basically it seems to me like you want the shortest path from (a) back to itself. But that's tricky, because the shortest path from a node to itself is always the empty path, of length 0. So if you do something like shortestPath((a)-[:REL*]->(a)) the result will always be empty which isn't what you want.

So instead, I matched from one :REL hop to the next item (somethingElse) and then did the shortest path from that back to A. The resulting returned path p is exactly the path you want, except that it's missing the original hop from a-->somethingElse. As long as you take that extra hop into account, then the combination of a-->somethingElse with p should be the answer you're looking for.

like image 32
FrobberOfBits Avatar answered Oct 03 '22 19:10

FrobberOfBits