Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cypher: analog of `sort -u` to merge 2 collections?

Tags:

neo4j

cypher

Suppose I have a node with a collection in a property, say

START x = node(17) SET x.c = [ 4, 6, 2, 3, 7, 9, 11 ];

and somewhere (i.e. from .csv file) I get another collection of values, say

c1 = [ 11, 4, 5, 8, 1, 9 ]

I'm treating my collections as just sets, order of elements does not matter. What I need is to merge x.c with c1 with come magic operation so that resulting x.c will contain only distinct elements from both. The following idea comes to mind (yet untested):

LOAD CSV FROM "file:///tmp/additives.csv" as row
START x=node(TOINT(row[0]))
MATCH c1 = [ elem IN SPLIT(row[1], ':') | TOINT(elem) ]
SET
x.c = [ newxc IN x.c + c1 WHERE (newx IN x.c AND newx IN c1) ];

This won't work, it will give an intersection but not a collection of distinct items. More RTFM gives another idea: use REDUCE() ? but how?

How to extend Cypher with a new builtin function UNIQUE() which accept collection and return collection, cleaned form duplicates?

UPD. Seems that FILTER() function is something close but intersection again :(

x.c = FILTER( newxc IN x.c + c1 WHERE (newx IN x.c AND newx IN c1) )

WBR, Andrii

like image 892
stesin Avatar asked Dec 09 '14 10:12

stesin


2 Answers

How about something like this...

with [1,2,3] as a1
, [3,4,5] as a2
with a1 + a2 as all
unwind all as a
return collect(distinct a) as unique

Add two collections and return the collection of distinct elements.

dec 15, 2014 - here is an update to my answer...

I started with a node in the neo4j database...

//create a node in the DB with a collection of values on it
create (n:Node {name:"Node 01",values:[4,6,2,3,7,9,11]}) 
return n

I created a csv sample file with two columns...

Name,Coll
"Node 01","11,4,5,8,1,9"

I created a LOAD CSV statement...

LOAD CSV 
WITH HEADERS FROM "file:///c:/Users/db/projects/coll-merge/load_csv_file.csv" as row

// find the matching node 
MATCH (x:Node) 
WHERE x.name = row.Name

// merge the collections
WITH x.values + split(row.Coll,',') AS combo, x

// process the individual values
UNWIND combo AS value

// use toInt as the values from the csv come in as string
// may be a better way around this but i am a little short on time
WITH toInt(value) AS value, x

// might as well sort 'em so they are all purdy
ORDER BY value
WITH collect(distinct value) AS values, x
SET x.values = values
like image 190
Dave Bennett Avatar answered Oct 23 '22 12:10

Dave Bennett


You could use reduce like this:

with [1,2,3] as a, [3,4,5] as b
return reduce(r = [], x in a + b | case when x in r then r else r + [x] end)
like image 2
Michael Hunger Avatar answered Oct 23 '22 12:10

Michael Hunger