Deleting all nodes and relationships in neo4j using cypher exceeds heap space

Tags: neo4j, cypher

I have been trying to run this query as recommended in the neo4j google group and in other sources online:

START n = node(*) MATCH n-[r?]-() WHERE ID(n)>0 DELETE n, r;

in order to delete all nodes and relationships between tests. When I do so from the console, I run out of Java heap space. When I do so from Python (using the newish graph_db.clear(), which appears to use the same query), I get a "SystemError: None", which I assume is the same Java heap space error.

I have a database with 500k nodes, only 5k relationships, and 7M properties. I am running neo4j-1.8.1 on a Mac laptop (10.6.8) with 8GB RAM. I am a bit surprised that deleting nodes (with essentially no relationships, so very small subgraphs) would exceed the Java heap space, but I am pretty naive about how neo4j works.

Any suggestions on how to go forward are appreciated. I do know that running rm -rf in the data directory and starting from scratch will work, but I thought there might be a less drastic solution.

[cross-posted to neo4j google groups]

seandavi asked Feb 04 '13 16:02


2 Answers

I found a better solution in the Neo4j knowledge base [1]:

CALL apoc.periodic.iterate(
    "MATCH (n) RETURN n",
    "DETACH DELETE n",
    {batchSize:1000}
)
YIELD batches, total RETURN batches, total

[1] - https://neo4j.com/developer/kb/large-delete-transaction-best-practices-in-neo4j/
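
If APOC is not installed, the same batching idea can be driven from client code. The following is only a minimal sketch under stated assumptions: run_query is a hypothetical callable (not part of any library) that executes one statement in its own transaction and returns the value of the deleted column, and the Cypher it sends relies on DETACH DELETE, so it assumes a Neo4j 3.x or later server rather than the 1.8.1 from the question.

```python
from typing import Callable

# Delete at most $batch nodes per statement and report how many were removed.
BATCH_DELETE = (
    "MATCH (n) WITH n LIMIT $batch "
    "DETACH DELETE n "
    "RETURN count(*) AS deleted"
)


def delete_in_batches(run_query: Callable[[str, dict], int], batch: int = 1000) -> int:
    """Repeat the batched delete until a pass removes nothing.

    run_query is a hypothetical helper supplied by the caller: it runs one
    statement in its own transaction and returns the `deleted` count.
    Returns the total number of nodes deleted across all passes.
    """
    total = 0
    while True:
        deleted = run_query(BATCH_DELETE, {"batch": batch})
        if deleted == 0:
            return total
        total += deleted
```

Because each pass commits separately, only one batch of nodes is held in memory at a time. With the official Python driver, run_query could plausibly wrap something like session.run(query, params).single()["deleted"], but that wiring is an assumption about your setup rather than something taken from the knowledge base article.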

antimirov answered Oct 10 '22 00:10


The Cypher statement in the question causes all nodes (except the root node with ID 0) to be instantiated and deleted in one single transaction. With 500k nodes, this uses too much memory.

Try limiting the number of nodes deleted per statement to somewhere around 10k-50k, for example:

START n = node(*) 
MATCH n-[r?]-() 
WHERE (ID(n)>0 AND ID(n)<10000) 
DELETE n, r;

START n = node(*) 
MATCH n-[r?]-() 
WHERE (ID(n)>0 AND ID(n)<20000) 
DELETE n, r;

etc.
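
Writing out every window by hand gets tedious, so the statements can be generated instead. This is a rough sketch, not a tested tool: windowed_deletes is a hypothetical helper name, max_id is assumed to be the highest node ID in your store (you would have to look that up yourself), and the generated text simply repeats the legacy 1.8 syntax shown above with a growing upper bound.

```python
from typing import Iterator


def windowed_deletes(max_id: int, step: int = 10000) -> Iterator[str]:
    """Yield one legacy-Cypher delete statement per ID window of size `step`.

    The upper bound grows by `step` each time and the last bound is strictly
    greater than max_id, so the node with ID max_id is covered as well.
    """
    for upper in range(step, max_id + step + 1, step):
        yield (
            "START n = node(*) MATCH n-[r?]-() "
            f"WHERE (ID(n)>0 AND ID(n)<{upper}) DELETE n, r;"
        )


if __name__ == "__main__":
    # Print the statements so they can be pasted into the console one by one.
    for statement in windowed_deletes(max_id=25000):
        print(statement)
```

Each printed statement should be run on its own so that every window commits in a separate transaction.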

That said, there's nothing wrong with removing the entire database directory; it's good practice.

Axel Morgner answered Oct 10 '22 02:10