I'm trying to match nodes in a Neo4j database. The nodes have a property called "name", and I'm using a regular expression in Cypher to match it. I only want to match whole words, so "javascript" should not match if I supply the string "java". If the string to match consists of several words, e.g. "java script", I will run two separate queries, one for "java" and one for "script".
This is what I have so far:
match (n) where n.name =~ '(?i).*\\bMYSTRING\\b.*' return n
This works, but not with some special characters like "+" or "#", so I can't search for "C++" or "C#" etc. The regular expression above just uses \b for the word boundary, with the backslash escaped so that it works inside the Cypher string.
I tried some variations of this post: regex to match word boundary beginning with special characters, but it didn't really work. Maybe I did something wrong.
How can I make this work with special characters in Cypher and Neo4j?
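A \b only matches between a word character and a non-word character, and "+" and "#" are themselves non-word characters, so the boundary after "C++" is never found. Try escaping the special characters and looking for non-word characters rather than word boundaries. For example: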
match (n) where n.name =~ '(?i).*(?:\\W|^)C\\+\\+(?:\\W|$).*' return n
This still has some false positives, though: for example, the pattern above will also match "c+++".
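Since Cypher's =~ uses Java regular expression syntax, the pattern can be tried out in plain Java before running it against the database. The following is only a minimal sketch: the class and method names are made up for illustration, and it assumes the search term is escaped with Pattern.quote instead of by hand.

```java
import java.util.regex.Pattern;

class WholeWordMatch {
    // Builds the same kind of pattern as the query above for an arbitrary term.
    // Pattern.quote escapes "+", "#" and other regex metacharacters.
    static Pattern wholeWord(String term) {
        return Pattern.compile("(?i).*(?:\\W|^)" + Pattern.quote(term) + "(?:\\W|$).*");
    }

    // matches() has to cover the whole input, which is why the
    // pattern keeps the leading and trailing ".*".
    static boolean matches(String name, String term) {
        return wholeWord(term).matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("C++ developer", "C++")); // true
        System.out.println(matches("javascript", "java"));   // false
        System.out.println(matches("c+++", "C++"));          // true, the false positive
    }
}
```

The last call shows the false positive: the third "+" is itself a non-word character, so it satisfies the trailing (?:\W|$).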
To express "a non-word character, except that + is treated as a word character", the following could work:
match (n) where n.name =~ '(?i).*(?:[\\W-[+]]|^)C\\+\\+(?:[\\W-[+]]|$).*' return n
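This subtraction syntax is not supported by all regex flavors, though. Neo4j uses Java regular expressions, which express the same idea with character class intersection instead, i.e. [\\W&&[^+]] in place of [\\W-[+]].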
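To see how a "non-word character except +" class behaves under Java regex rules, here is a small sketch using Java's class intersection syntax. The class and method names are hypothetical, and treating only "+" as an extra word character is an assumption taken from the C++ example; other terms may need other characters excluded.

```java
import java.util.regex.Pattern;

class PlusAsWordChar {
    // Any non-word character except "+", written as a Java class intersection.
    static final String BOUNDARY = "[\\W&&[^+]]";

    static boolean matches(String name, String term) {
        String regex = "(?i).*(?:" + BOUNDARY + "|^)"
                + Pattern.quote(term)           // escape "+", "#", etc.
                + "(?:" + BOUNDARY + "|$).*";
        return Pattern.matches(regex, name);
    }

    public static void main(String[] args) {
        System.out.println(matches("Java, C++, Go", "C++")); // true
        System.out.println(matches("c+++", "C++"));          // false
    }
}
```

Inside a Cypher string literal the backslash has to be doubled, just as in the queries above.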