
Neo4j Cypher: Merge duplicate nodes

Tags: neo4j, cypher

I have some duplicate nodes, all with the label Tag. By duplicates I mean two or more nodes with the same name property, for example:

{ name: writing, _id: 57ec2289a90f9a2deece7e6d},
{ name: writing, _id: 57db1da737f2564f1d5fc5a1},
{ name: writing }

The _id field is no longer used, so for all practical purposes these three nodes are the same, except that each of them has different relationships.

What I would like to do is:

  1. Find all duplicate nodes (check)

    MATCH (n:Tag)
    WITH n.name AS name, COLLECT(n) AS nodelist, COUNT(*) AS count
    WHERE count > 1
    RETURN name, nodelist, count
    
  2. Copy all relationships from the duplicate nodes into the first one

  3. Delete all the duplicate nodes

Can this be achieved with a Cypher query, or do I have to write a script in some programming language? (That is what I'm trying to avoid.)

Juan Fuentes asked Dec 06 '22 15:12


2 Answers

APOC Procedures has some graph refactoring procedures that can help. I think apoc.refactor.mergeNodes() ought to do the trick.

Be aware that in addition to transferring all relationships from the other nodes onto the first node of the list, it will also apply any labels and properties from the other nodes onto the first node. If that's not something you want to do, then you may have to collect incoming and outgoing relationships from the other nodes and use apoc.refactor.to() and apoc.refactor.from() instead.
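If merging labels and properties is not wanted, the relationship-only route could look roughly like the sketch below. It assumes the APOC signatures apoc.refactor.to(rel, endNode) and apoc.refactor.from(rel, newNode), and it keeps the node with the lowest internal id as the survivor, which is an arbitrary choice and not something stated in the question. The three statements are run one after another; the ORDER BY id(n) is there so that each statement picks the same survivor.

```cypher
// 1. Re-point incoming relationships of each duplicate at the kept node.
//    Assumption: the survivor ("keep") is the node with the lowest internal id.
MATCH (n:Tag)
WITH n ORDER BY id(n)
WITH n.name AS name, collect(n) AS nodelist
WHERE size(nodelist) > 1
WITH head(nodelist) AS keep, tail(nodelist) AS dups
UNWIND dups AS dup
MATCH (dup)<-[r]-()
CALL apoc.refactor.to(r, keep) YIELD output
RETURN count(output) AS movedIncoming;

// 2. Re-point outgoing relationships of each duplicate so they start at the kept node.
MATCH (n:Tag)
WITH n ORDER BY id(n)
WITH n.name AS name, collect(n) AS nodelist
WHERE size(nodelist) > 1
WITH head(nodelist) AS keep, tail(nodelist) AS dups
UNWIND dups AS dup
MATCH (dup)-[r]->()
CALL apoc.refactor.from(r, keep) YIELD output
RETURN count(output) AS movedOutgoing;

// 3. Delete the duplicates, which should have no relationships left by now.
MATCH (n:Tag)
WITH n ORDER BY id(n)
WITH n.name AS name, collect(n) AS nodelist
WHERE size(nodelist) > 1
UNWIND tail(nodelist) AS dup
DETACH DELETE dup;
```

Note that if the kept node and a duplicate were both connected to the same neighbour, the kept node ends up with parallel relationships to it, since nothing here deduplicates them.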

Here's the query for merging nodes:

    MATCH (n:Tag)
    WITH n.name AS name, COLLECT(n) AS nodelist, COUNT(*) AS count
    WHERE count > 1
    CALL apoc.refactor.mergeNodes(nodelist) YIELD node
    RETURN node
InverseFalcon answered Dec 30 '22 12:12


The Cypher query above didn't work on my database (version 3.4.16).

What worked for me was:

    MATCH (n:Tag)
    WITH n.name AS name, COLLECT(n) AS nodelist, COUNT(*) AS count
    WHERE count > 1
    CALL apoc.refactor.mergeNodes(nodelist, {
      properties: "combine",
      mergeRels: true
    })
    YIELD node
    RETURN node;
freshNfunky answered Dec 30 '22 12:12
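A note on the config map used with apoc.refactor.mergeNodes: as far as the APOC documentation describes it, properties: "combine" keeps a single value when all merged nodes agree and collects differing values into a list, while mergeRels: true merges relationships of the same type and direction. For the nodes in the question that would presumably leave name as writing and turn the differing _id values into a list. A quick way to check the result is to re-run the duplicate search from the question, which should come back empty after the merge. The second statement is an optional clean-up and assumes, as the question states, that _id is no longer needed on any Tag node.

```cypher
// After the merge, this should return no rows.
MATCH (n:Tag)
WITH n.name AS name, count(*) AS count
WHERE count > 1
RETURN name, count;

// Optional clean-up (assumption: _id is unused on every Tag node).
MATCH (n:Tag)
WHERE n._id IS NOT NULL
REMOVE n._id;
```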