
Combine two merges into one in the Cypher language

I'm trying to write a Cypher query that merges more relationships than I can fit into a single ASCII-art pattern.

TL;DR: how do I accomplish this merge?

MERGE (a)-[:REL1]->(b:B)-[:REL2]->(c), (b)-[:REL3]->(d)

I have these simplified cypher queries to describe the problem:

// ensure required nodes exist
MATCH (a:A {id: "<uuid1>"})
MATCH (c:C {id: "<uuid2>"})
MATCH (d:D {id: "<uuid3>"})

// Make B connect the nodes
MERGE (a)-[:REL1]->(b:B)-[:REL2]->(c)
MERGE (b)-[:REL3]->(d) // <- that's the main problem: a separate MERGE creates this relationship, but it should be part of the first MERGE.

// Conclude
RETURN a,b,c,d

This query works, but when it is called multiple times with different d nodes, b is reused. By that I mean that several (b)-[:REL3]->(d) relationships end up attached to the same b. That's not allowed in my system, since I should be able to delete b and only affect exactly what was created by the first call.

To ensure that b is unique, I could do this:

// ensure required nodes exist
MATCH (a:A {id: "<uuid1>"})
MATCH (c:C {id: "<uuid2>"})
MATCH (d:D {id: "<uuid3>"})

// ensure unique B
CREATE (b:B)

// Make B connect the nodes
MERGE (a)-[:REL1]->(b)-[:REL2]->(c)
MERGE (b)-[:REL3]->(d)

// Conclude
RETURN a,b,c,d

The problem with this one is that each time it's called, a new B node is created, even if the path already existed. That's just duplicated data, and I don't want that either.

I could fix that by adding a WITH/WHERE clause:

// ensure required nodes exist
MATCH (a:A {id: "<uuid1>"})
MATCH (c:C {id: "<uuid2>"})
MATCH (d:D {id: "<uuid3>"})

OPTIONAL MATCH (a)-[:REL1]->(existingB:B)-[:REL2]->(c), (existingB)-[:REL3]->(d)

WITH a,existingB,c,d
WHERE existingB IS NULL // when a matching B already exists, every row is filtered out here and zero rows are returned

// ensure unique B
CREATE (b:B)

// Make B connect the nodes
MERGE (a)-[:REL1]->(b)-[:REL2]->(c)
MERGE (b)-[:REL3]->(d)

// Conclude
RETURN a,b,c,d

However, when the path already exists, the query no longer returns a, b, c, d, which I want it to.

So to sum it up, I want a query which:

  1. Always returns.
  2. Creates a new b node that connects a, c and d, if one doesn't already exist.
  3. If it does exist, find it and return it instead.

This is quite simple when dealing with simple merges: MATCH > MERGE > RETURN. The only thing that twists my mind is that I can't see how to do this with a single MERGE command.

As far as I can tell, combining multiple patterns in one MERGE isn't possible, but I hope that someone has a solution for this.

UPDATED with real-life example

Let's start by creating the required nodes in our access management example:

// create required nodes
CREATE (:Human {name:"Human A"})
CREATE (:Human {name:"Human B"})
CREATE (:Human {name:"Human C"})
CREATE (:Scope {name:"read:email"})

That should leave us with this: (image: Required nodes for example)

Now I want to grant Human A access to read:email on behalf of Human B:

// grant "Human A" access to "read:email" on behalf of "Human B" - aka let Human A read Human B's email address
MATCH (humanA:Human {name:"Human A"})
MATCH (readEmail:Scope {name:"read:email"})
MATCH (humanB:Human {name:"Human B"})

MERGE (humanA)-[:IS_GRANTED]->(gr:Grant:Rule)-[:GRANTS]->(readEmail)
MERGE (gr)-[:ON_BEHALF_OF]->(humanB)

This brings us to the following state: (image: Human A granted read:email to Human B)

So far, everything is good. I can rerun the query and the same exact state is kept.

Now I want Human A to have read:email for Human C as well. Same query, new "on behalf of".

// grant "Human A" access to "read:email" on behalf of "Human C" - aka let Human A read Human C's email address
MATCH (humanA:Human {name:"Human A"})
MATCH (readEmail:Scope {name:"read:email"})
MATCH (humanC:Human {name:"Human C"})

MERGE (humanA)-[:IS_GRANTED]->(gr:Grant:Rule)-[:GRANTS]->(readEmail)
MERGE (gr)-[:ON_BEHALF_OF]->(humanC)

Now the problem arises: (image: Reuse of grant rule problem)

The Grant Rule is being reused, which is a problem for multiple reasons, but let's just state the obvious one: when I want to remove Human A's access to Human B's email, the access to Human C's email is removed too, since they share the same rule.

Now one might say, why not merge "on behalf of" first to avoid this problem? Let's try and start over, but adding another scope "read:phone":

// create required nodes
CREATE (:Human {name:"Human A"})
CREATE (:Human {name:"Human B"})
CREATE (:Human {name:"Human C"})
CREATE (:Scope {name:"read:email"})
CREATE (:Scope {name:"read:phone"})

(image: Required nodes for example)

And try moving it around:

// grant "Human A" access to "read:email" on behalf of "Human B" - aka let Human A read Human B's email address
MATCH (humanA:Human {name:"Human A"})
MATCH (readEmail:Scope {name:"read:email"})
MATCH (humanB:Human {name:"Human B"})

MERGE (humanA)-[:IS_GRANTED]->(gr:Grant:Rule)-[:ON_BEHALF_OF]->(humanB)
MERGE (gr)-[:GRANTS]->(readEmail)

Just as last time, we end up with a correct state:

(image: Human A granted read:email to Human B)

Now I want to grant Human A access to Human B's read:phone as well:

// grant "Human A" access to "read:phone" on behalf of "Human B" - aka let Human A read Human B's phone number
MATCH (humanA:Human {name:"Human A"})
MATCH (readPhone:Scope {name:"read:phone"})
MATCH (humanB:Human {name:"Human B"})

MERGE (humanA)-[:IS_GRANTED]->(gr:Grant:Rule)-[:ON_BEHALF_OF]->(humanB)
MERGE (gr)-[:GRANTS]->(readPhone)

Now this gives us this:

(image: reusing of grant rule)

That's incorrect: now I can only revoke all or none of Human A's access on behalf of Human B.

That was a lot, but I hope it gave some insight into the problem.
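
The shared rule can also be spotted with a query rather than by eye. The following is only a sketch, assuming the labels and relationship types used in the examples above; the aliases scopes and beneficiaries are made up for illustration. It lists every Grant:Rule node that serves more than one Scope or more than one "on behalf of" Human:

```cypher
// Find grant rules that are shared between several scopes or beneficiaries.
MATCH (g:Grant:Rule)
OPTIONAL MATCH (g)-[:GRANTS]->(s:Scope)
OPTIONAL MATCH (g)-[:ON_BEHALF_OF]->(h:Human)
// count(DISTINCT ...) ignores nulls, so a missing leg simply counts as 0
WITH g, count(DISTINCT s) AS scopes, count(DISTINCT h) AS beneficiaries
WHERE scopes > 1 OR beneficiaries > 1
RETURN g, scopes, beneficiaries
```

In the state I want, this query should return no rows.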

Lasse asked Nov 13 '19 16:11


1 Answer

[UPDATED (TWICE)]

This trick may work for your "3-legged-merge" (to coin a term). The second illustration in your question shows an example of the desired outcome of a 3-legged-merge, where a given Grant node has relationships to 3 specific nodes and ONLY those 3 nodes.

The trick is this: add 2 properties (or 3, see Notes below) to each Grant that uniquely identify the associated Scope, and the associated Human on whose behalf the grant is acting. This is admittedly redundant information if you also have relationships to the actual Scope and the Human nodes, but it should ensure that you can use MERGE to create a unique Grant node for each unique set of 3 legs.

For example, to properly perform the second query (in your update), assuming that name values are unique:

MATCH (humanA:Human {name:"Human A"})
MATCH (readEmail:Scope {name:"read:email"})
MATCH (humanC:Human {name:"Human C"})

MERGE (humanA)-[:IS_GRANTED]->(g:Grant:Rule {for: humanC.name, scope: readEmail.name})
MERGE (g)-[:ON_BEHALF_OF]->(humanC)
MERGE (g)-[:GRANTS]->(readEmail)
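
With one Grant node per combination, revoking a single grant should no longer touch the others. A minimal sketch, assuming the for and scope properties set by the MERGE above and the same unique name values:

```cypher
// Revoke only Human A's "read:email" grant on behalf of Human C.
MATCH (:Human {name: "Human A"})-[:IS_GRANTED]->(g:Grant:Rule {for: "Human C", scope: "read:email"})
// DETACH DELETE removes the Grant node together with all of its relationships
DETACH DELETE g
```

A grant for Human B's email, if one exists, lives on a different Grant node and stays in place.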

Notes:

  • The first MERGE ensures the g node is only associated with "Human A", so there is no need to add a third property to g with the unique identifier for "Human A" -- if and only if you always start your 3-legged-merge with the IS_GRANTED relationship.

    However, if you could sometimes create Grant nodes starting with one of the other "legs", then you'd need to have a property for each leg.

  • You will have to keep the Grant properties even after you create the associated relationships, so that future 3-legged merges will work properly.

  • Strictly speaking, you actually do not need to perform either of the last 2 MERGEs, since the Grant node will contain enough information to get the missing (virtual) legs dynamically, as needed. For example, to get the Scopes involving "Human A" on behalf of "Human C":

    MATCH
      (:Human {name:"Human A"})-[:IS_GRANTED]->(g {for: "Human C"}),
      (scope:Scope {name: g.scope})
    RETURN scope
    

    This is less efficient than having the actual relationships but saves on storage. Creating appropriate indexes (e.g., on ":Scope(name)" in this case) would reduce the speed penalty.
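
As a rough sketch of the first and last notes combined: the variant below stores one property per leg, so the Grant node can be merged on its own before any relationship exists, and it returns the nodes on every call. The grantee property and the index name scope_name are hypothetical names chosen for illustration, and the index syntax depends on your Neo4j version. Run the index statement separately from the query.

```cypher
// Index for the name lookups (Neo4j 4.0+ syntax; on 3.5 use: CREATE INDEX ON :Scope(name))
CREATE INDEX scope_name FOR (s:Scope) ON (s.name);

MATCH (humanA:Human {name: "Human A"})
MATCH (readPhone:Scope {name: "read:phone"})
MATCH (humanB:Human {name: "Human B"})

// `grantee` is an assumed third property; with all three legs stored on the node,
// the order in which the relationships are merged no longer matters
MERGE (g:Grant:Rule {grantee: humanA.name, for: humanB.name, scope: readPhone.name})
MERGE (humanA)-[:IS_GRANTED]->(g)
MERGE (g)-[:ON_BEHALF_OF]->(humanB)
MERGE (g)-[:GRANTS]->(readPhone)
RETURN humanA, g, humanB, readPhone
```

Rerunning the query should match the existing Grant node and return it instead of creating a duplicate, as long as the name values are unique and never null.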

cybersam answered Oct 06 '22 01:10