
Cypher query with regular expression

I'm trying to match nodes in a Neo4j database. The nodes have a property called "name", and I'm using a regular expression in Cypher to match it. I only want to match whole words, so "javascript" should not match if I supply the string "java". If the string to match consists of several words, e.g. "java script", I will run two separate queries, one for "java" and one for "script".

This is what I have so far:

match (n) where n.name =~ '(?i).*\\bMYSTRING\\b.*' return n

This works, but not with some special characters such as "+" or "#", so I can't search for "C++" or "C#". The regular expression above just uses \b for the word boundary, with the backslash doubled so that it is escaped correctly inside the Cypher string.

I tried some variations of the approach in this post: "regex to match word boundary beginning with special characters", but it didn't really work; maybe I did something wrong.

How can I make this work with special characters in Cypher and Neo4j?

Øyvind asked Sep 18 '14 09:09



1 Answer

Try escaping the special characters and looking for non-word characters rather than word boundaries. For example:

match (n) where n.name =~ '(?i).*(?:\\W|^)C\\+\\+(?:\\W|$).*' return n

This still produces some false positives, though; for example, the pattern above will match "c+++".
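To see the difference between the two patterns outside the database, here is a minimal Java sketch. It assumes that Cypher's =~ behaves like Java's String.matches (Cypher documents its regex syntax as inherited from Java regular expressions). The class and method names are hypothetical, and Pattern.quote is used in place of hand-escaping the "+" characters.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative, not part of Neo4j.
final class NonWordDelimiterDemo {

    // The question's approach: \b on both sides of the search term.
    static boolean matchesWithBoundary(String name, String term) {
        String regex = "(?i).*\\b" + Pattern.quote(term) + "\\b.*";
        return name.matches(regex);
    }

    // The answer's approach: a non-word character, or the start/end
    // of the string, on each side of the search term.
    static boolean matchesWithNonWord(String name, String term) {
        String regex = "(?i).*(?:\\W|^)" + Pattern.quote(term) + "(?:\\W|$).*";
        return name.matches(regex);
    }

    public static void main(String[] args) {
        // false: there is no word boundary between '+' and the end of the string
        System.out.println(matchesWithBoundary("I know C++", "C++"));
        // true: "C++" is preceded by a space and followed by the end of the string
        System.out.println(matchesWithNonWord("I know C++", "C++"));
        // true: the false positive, because the third '+' counts as a non-word character
        System.out.println(matchesWithNonWord("c+++", "C++"));
        // false: whole-word matching still holds for plain words
        System.out.println(matchesWithNonWord("javascript", "java"));
    }
}
```

The \b version fails because \b only matches between a word character and a non-word character, and "+" followed by a space or by the end of the string provides no such transition.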

To express "non-word character, except that we want to treat + as a word character", the following could work:

match (n) where n.name =~ '(?i).*(?:[\\W-[+]]|^)C\\+\\+(?:[\\W-[+]]|$).*' return n

However, this character-class subtraction syntax is not supported by all regex flavors, and I am not sure whether Neo4j supports it.
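As far as I know, Java regular expressions (from which Cypher inherits its regex syntax) do not implement the [\W-[+]] subtraction form, but they do support character-class intersection, so "non-word character other than +" can be written as [\W&&[^+]]. The sketch below is an assumption-laden illustration of that alternative, not a tested Neo4j query; the class, constant, and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative, not part of Neo4j.
final class PlusAwareDelimiterDemo {

    // A non-word character that is not '+', using Java's class intersection.
    private static final String DELIM = "[\\W&&[^+]]";

    // True if 'term' occurs in 'name' delimited by DELIM or by the
    // start/end of the string (case-insensitive, whole-string match).
    static boolean matchesTerm(String name, String term) {
        String regex = "(?i).*(?:" + DELIM + "|^)"
                + Pattern.quote(term)
                + "(?:" + DELIM + "|$).*";
        return name.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(matchesTerm("I know C++", "C++")); // true
        System.out.println(matchesTerm("c+++", "C++"));       // false: '+' is no longer a delimiter
    }
}
```

If this carries over to Cypher, the pattern would presumably read '(?i).*(?:[\\W&&[^+]]|^)C\\+\\+(?:[\\W&&[^+]]|$).*'. Note that only "+" is special-cased here, so a term such as "C#" would still match "C##" unless "#" is excluded from the delimiter class as well.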

Taemyr answered Oct 03 '22 14:10
