
How can I optimise a Neo4j MERGE query on a node with many relationships?

Tags:

neo4j

cypher

I have a graph with a node that has many outgoing relationships. The time it takes to add new outgoing relationships degrades as I add more relationships. The degradation appears to be due to the time taken to check that the relationship doesn't already exist (I'm using MERGE to add the relationships).

The destination nodes of the relationships have very few relationships themselves. Is there any way I can force Neo4j to check for the existence of the relationship from the destination node instead of from the source node?

Here's a test script to reproduce the problem. It creates one node with id 0, followed by 1000 nodes connected to node 0 by the HAS relationship. As nodes are added, the execution time of each batch increases linearly.

CREATE CONSTRAINT ON (n:Node) ASSERT n.id IS UNIQUE

UNWIND RANGE(1,1000) AS i
MERGE (from:Node { id: 0 })
MERGE (to:Node { id: i})
MERGE (from)-[:HAS]->(to)

Added 1001 labels, created 1001 nodes, set 1001 properties, created 1000 relationships, statement executed in 3496 ms.

UNWIND RANGE(1001,2000) AS i
MERGE (from:Node { id: 0 })
MERGE (to:Node { id: i})
MERGE (from)-[:HAS]->(to)

Added 1000 labels, created 1000 nodes, set 1000 properties, created 1000 relationships, statement executed in 7030 ms.

UNWIND RANGE(2001,3000) AS i
MERGE (from:Node { id: 0 })
MERGE (to:Node { id: i})
MERGE (from)-[:HAS]->(to)

Added 1000 labels, created 1000 nodes, set 1000 properties, created 1000 relationships, statement executed in 10489 ms.

UNWIND RANGE(3001,4000) AS i
MERGE (from:Node { id: 0 })
MERGE (to:Node { id: i})
MERGE (from)-[:HAS]->(to)

Added 1000 labels, created 1000 nodes, set 1000 properties, created 1000 relationships, statement executed in 14390 ms.

If CREATE is used instead of MERGE for the relationship, the performance is much better. I can't use CREATE, though, because I want to ensure the relationships are unique.

UNWIND RANGE(4001,5000) AS i
MERGE (from:Node { id: 0 })
MERGE (to:Node { id: i})
CREATE (from)-[:HAS]->(to)

Added 1000 labels, created 1000 nodes, set 1000 properties, created 1000 relationships, statement executed in 413 ms.

Note: Tested with Neo4j v2.2.2

Dave asked Jun 19 '15 05:06




1 Answer

This is because Cypher is not yet clever enough to use the degree of the nodes when executing MERGE. The COST optimizer, which is used for reads, is already cleverer, but updates still go through the old RULE optimizer.
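One way to look at how a single relationship MERGE is executed is to prefix it with PROFILE, which is available in Neo4j 2.2 and later. This is only a sketch against the test data from the question (the ids 0 and 1 are taken from there); the operators and row counts you see depend on your version and data, so none are shown here. Note that PROFILE actually runs the statement, so the relationship is created if it does not exist yet.

```cypher
// Profile one relationship MERGE between two nodes that already exist.
// Both nodes are looked up via the unique constraint on :Node(id).
PROFILE
MATCH (from:Node { id: 0 }), (to:Node { id: 1 })
MERGE (from)-[:HAS]->(to)
```

Comparing the reported database hits before and after adding more relationships to node 0 should show whether the existence check scales with the degree of the source node.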

I played around with it for a bit, unsuccessfully:

* changing the order of from & to
* using CREATE UNIQUE instead of MERGE
* trying to use path expressions which use get-degree in COST

Then I remembered that shortestPath actually takes degrees into account and also goes from left to right.

So I tried to combine that with CREATE, and it worked really well. Here is an example for 100,000 nodes.

neo4j-sh (?)$ CREATE CONSTRAINT ON (n:Node) ASSERT n.id IS UNIQUE;
+-------------------+
| No data returned. |
+-------------------+
Constraints added: 1
1054 ms
neo4j-sh (?)$ 
neo4j-sh (?)$ UNWIND RANGE(0,100000) AS i CREATE (to:Node { id: i});
+-------------------+
| No data returned. |
+-------------------+
Nodes created: 100001
Properties set: 100001
Labels added: 100001
2375 ms
neo4j-sh (?)$ 
neo4j-sh (?)$ 
neo4j-sh (?)$ MATCH (from:Node { id: 0 })
> UNWIND RANGE(1,100000) AS i
> MATCH (to:Node { id: i})
> WHERE shortestPath((to)<-[:HAS]-(from)) IS NULL
> CREATE (from)-[:HAS]->(to);
+-------------------+
| No data returned. |
+-------------------+
Relationships created: 100000
2897 ms
neo4j-sh (?)$ 
neo4j-sh (?)$ 
neo4j-sh (?)$ MATCH (from:Node { id: 0 })
> UNWIND RANGE(1,100000) AS i
> MATCH (to:Node { id: i})
> WHERE shortestPath((to)<-[:HAS]-(from)) IS NULL
> CREATE (from)-[:HAS]->(to);
+--------------------------------------------+
| No data returned, and nothing was changed. |
+--------------------------------------------+
2360 ms
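The same check-then-create pattern can probably be applied to the batches from the question, where the destination nodes do not exist yet. The following is an untested sketch: it keeps MERGE for the nodes (backed by the unique constraint) and assumes that the shortestPath predicate is accepted in a WHERE after WITH in the same way as after MATCH in the transcript above. The id range 5001 to 6000 is just an illustrative next batch.

```cypher
// Create or match the nodes as in the question.
UNWIND RANGE(5001,6000) AS i
MERGE (from:Node { id: 0 })
MERGE (to:Node { id: i })
// Keep only the pairs that are not yet connected by a HAS relationship;
// the pattern starts at the low-degree destination node.
WITH from, to
WHERE shortestPath((to)<-[:HAS]-(from)) IS NULL
CREATE (from)-[:HAS]->(to)
```

Running the statement a second time should create nothing, as in the second run shown above.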
Michael Hunger answered Oct 20 '22 05:10