I have some duplicate nodes, all with the label Tag. By duplicates I mean two or more nodes with the same name property, for example:
{ name: writing, _id: 57ec2289a90f9a2deece7e6d},
{ name: writing, _id: 57db1da737f2564f1d5fc5a1},
{ name: writing }
The _id field is no longer used, so for all practical purposes these three nodes are the same, except that each of them has different relationships.
What I would like to do is:
Find all duplicate nodes (check)
MATCH (n:Tag)
WITH n.name AS name, COLLECT(n) AS nodelist, COUNT(*) AS count
WHERE count > 1
RETURN name, nodelist, count
Copy all relationships from the duplicate nodes into the first one
Can this be achieved with a Cypher query, or do I have to write a script in some programming language? (That is what I'm trying to avoid.)
APOC Procedures has some graph refactoring procedures that can help. I think apoc.refactor.mergeNodes() ought to do the trick.
Be aware that in addition to transferring all relationships from the other nodes onto the first node of the list, it will also apply any labels and properties from the other nodes onto the first node. If that's not something you want to do, then you may have to collect incoming and outgoing relationships from the other nodes and use apoc.refactor.to() and apoc.refactor.from() instead.
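For the case where labels and properties should not be copied, a rough sketch of the apoc.refactor.from() / apoc.refactor.to() route might look like the following. It is untested against your data and rests on a few assumptions that are mine, not part of the question: the node with the lowest internal id is treated as the "first" one to keep, the three statements are run one at a time in this order, and the aliases keep, dups, dup, moved and deleted are just illustrative names.

```cypher
// 1) Re-point OUTGOING relationships of each duplicate so they start at the kept node.
MATCH (n:Tag)
WITH n ORDER BY id(n)                     // assumption: keep the lowest internal id
WITH n.name AS name, collect(n) AS nodelist
WHERE size(nodelist) > 1
WITH head(nodelist) AS keep, tail(nodelist) AS dups
UNWIND dups AS dup
MATCH (dup)-[r]->()
CALL apoc.refactor.from(r, keep) YIELD input, output
RETURN count(*) AS moved;

// 2) Re-point INCOMING relationships of each duplicate so they end at the kept node.
MATCH (n:Tag)
WITH n ORDER BY id(n)
WITH n.name AS name, collect(n) AS nodelist
WHERE size(nodelist) > 1
WITH head(nodelist) AS keep, tail(nodelist) AS dups
UNWIND dups AS dup
MATCH ()-[r]->(dup)
CALL apoc.refactor.to(r, keep) YIELD input, output
RETURN count(*) AS moved;

// 3) Delete the duplicates. Plain DELETE fails if a duplicate still has
//    relationships, which acts as a check that steps 1 and 2 moved everything.
MATCH (n:Tag)
WITH n ORDER BY id(n)
WITH n.name AS name, collect(n) AS nodelist
WHERE size(nodelist) > 1
UNWIND tail(nodelist) AS dup
DELETE dup
RETURN count(*) AS deleted;
```

Note that this only moves relationships. If two duplicates were connected to the same neighbour, the kept node ends up with parallel relationships to it, and a relationship between two duplicates becomes a self-loop on the kept node, so some cleanup may still be needed afterwards.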
Here's the query for merging nodes:
MATCH (n:Tag)
WITH n.name AS name, COLLECT(n) AS nodelist, COUNT(*) AS count
WHERE count > 1
CALL apoc.refactor.mergeNodes(nodelist) YIELD node
RETURN node
The above Cypher query didn't work on my database, version 3.4.16.
What worked for me was:
MATCH (n:Tag)
WITH n.name AS name, COLLECT(n) AS nodelist, COUNT(*) AS count
WHERE count > 1
CALL apoc.refactor.mergeNodes(nodelist,{
properties:"combine",
mergeRels:true
})
YIELD node
RETURN node;