Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to add multiple values to existing nodes with Cypher in Neo4J

I'm trying to load some data in Neo4J. I have a Person node which is already setup. Now, this node needs to have an email property which should be an array(or collection). Basically, the email property needs to have multiple values, like -

email: ["[email protected]", "[email protected]"]

I've come across similar questions here but all of the answers indicate to setting multiple property values at the time the node itself is created. Like this query from this answer -

CREATE (e:Employee { name:"Sam",languages: ["C", "C#"]})
RETURN e

But the problem in my case is that Person node is already created, and I need to set the email property on it now.

This is a small subset of the data I have to load -

 Personid|email 
933|[email protected] 
933|[email protected]
933|[email protected] 
1129|[email protected]
1129|[email protected] 
1129|[email protected]
4194|[email protected] 
4194|[email protected]

Also, the data is coming from a CSV file with thousands of rows, so my query needs to be generic, I can't set the properties for each individual Person node.

When I was testing out the creation of the email property with this subset, my first attempt was this -

 MATCH (n:TESTPERSON{id:933})
 SET n.email = "[email protected]"
 RETURN n

 MATCH (n:TESTPERSON{id:933})
 SET n.email = "[email protected]"
 RETURN n

As I was thinking, this just overwrites the email property to the value in the most recent query.

After looking at the answers here and on the Cypher docs, I found out that Neo4J allows you to set an array/collection (multiple values of the same type) as a property value, and then I tried this -

 // CREATE test node
 CREATE (n:TESTPERSON{id:933})
 RETURN n

 // at this time, this node does not have any `email` property, so setup 
 // email as an array with one string value
 MATCH (n:TESTPERSON{id:933})
 SET n.email = ["[email protected]"]
 RETURN n


 // Now, using +=, I can append to the array of strings
 MATCH (n:TESTPERSON{id:933})
 SET n.email = n.email + "[email protected]"
 RETURN n

 // add a third value to array
 MATCH (n:TESTPERSON{id:933})
 SET n.email = n.email + "[email protected]"
 RETURN n

Here's the result - enter image description here

As you can see, the email property now has multiple values.

But the problem is that since my CSV file has thousands of rows, I need a generic query to do this.

I thought of using a CASE statement as per the documentation here, and tried this -

MATCH (n:TESTPERSON {id:933}) 
CASE 
WHEN n.email IS NULL THEN SET n.email = [ "[email protected]"] 
ELSE SET n.email = n.email + "[email protected]" 
RETURN n

But this just throws the error - mismatched input CASE expecting ;.

I was hoping I could use this query as a generic way for my CSV file like this -

LOAD CSV WITH HEADERS FROM 'FILEURL' AS line FIELDTERMINATOR `|`
MATCH (n:TESTPERSON {id:toInt(line.Personid)}) 
CASE 
WHEN n.email IS NULL THEN SET n.email = [line.email] 
ELSE SET n.email = n.email + line.email 

But I don't even know if this would work, even if the CASE error is fixed.

I'm really stumped, and would appreciate any help. Thank You.

like image 297
Manish Giri Avatar asked Nov 27 '25 23:11

Manish Giri


1 Answers

You can use COALESCE() to use a default value in case the value you're trying to get is null. You might use it like this:

... SET n.email = COALESCE(n.email, []) + "[email protected]" ...

Whenever you're setting an array of values as a node property, it's a good idea to consider whether you might instead model these as separate nodes with relationships to the original node.

In this case, :Email nodes with some relationship to your :TESTPERSON nodes, with one :Email node per email, and multiple relationships from :TESTPERSON to multiple :Emails.

An advantage here is you'd be able to support uniqueness constraints, if you want to ensure there's only one :Email in the system, and you would be able to quickly look up a person by their email if you have an index or unique constraint, as the query would use the index to lookup the :Email and from there it's only one relationship traversal to the owner of the email.

When you have values in a collection on a node, you can't use an index lookup to a value in the collection, so your current model won't be able to quickly lookup a person by their email.

like image 51
InverseFalcon Avatar answered Nov 29 '25 11:11

InverseFalcon



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!