What is the best way to cleanup the graph from all nodes and relationships via Cypher?
At http://neo4j.com/docs/stable/query-delete.html#delete-delete-a-node-and-connected-relationships the example
MATCH (n)
OPTIONAL MATCH (n)-[r]-()
DELETE n,r
has the note:
This query isn’t for deleting large amounts of data
So, is the following better?
MATCH ()-[r]-() DELETE r
and
MATCH (n) DELETE n
Or is there another way that is better for large graphs?
As you've mentioned the most easy way is to stop Neo4j, drop the data/graph.db
folder and restart it.
Deleting a large graph via Cypher will be always slower but still doable if you use a proper transaction size to prevent memory issues (remember transaction are built up in memory first before they get committed). Typically 50-100k atomic operations is a good idea. You can add a limit to your deletion statement to control tx sizes and report back how many nodes have been deleted. Rerun this statement until a value of 0 is returned back:
MATCH (n)
OPTIONAL MATCH (n)-[r]-()
WITH n,r LIMIT 50000
DELETE n,r
RETURN count(n) as deletedNodesCount
According to the official document here:
MATCH (n)
DETACH DELETE n
but it also said This query isn’t for deleting large amounts of data
. so it's better use with limit.
match (n)
with n limit 10000
DETACH DELETE n;
Wrote this little script, added it in my NEO/bin folder.
Tested on v3.0.6 community
#!/bin/sh
echo Stopping neo4j
./neo4j stop
echo Erasing ALL data
rm -rf ../data/databases/graph.db
./neo4j start
echo Done
I use it when my LOAD CSV imports are crappy.
Hope it helps
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With