
Cypher query to search a phrase in all properties

Tags:

neo4j

cypher

I've encountered an interesting problem while trying to implement search in my Neo4j DB. I want to search for a certain phrase (also allowing partial matches) in any of a node's properties. This has to be generic and work for all node types and labels, so I can't rely on a pre-defined list of properties to search.

To illustrate the problem, consider the famous Movie DB tutorial that comes bundled with the Neo4j browser (:play movie graph). Let's say I want to find nodes with the label Movie that have a property that starts with 'The'. My first idea was:

match (m:Movie)
where (any(prop in keys(m) where m[prop] starts with "The")) 
return m

This of course throws an error because one of the properties is a number and not a string. Using toString won't help me, because in my DB some of the properties are boolean, and booleans don't respond to toString.

My next attempt was with regex, which is also better for search because I can make it case insensitive and more robust in general. So I did this:

match (m:Movie)
where (any(prop in keys(m) where m[prop] =~ "(?i)The .*"))
return m

And it worked! I got all the movies whose title or tagline starts with 'The'. And there was much rejoicing.

But now comes the tricky part. My search also needs to support negation, i.e. return all movies that don't have any property that starts with 'The'. I obviously tried:

match (m:Movie)
where NOT (any(prop in keys(m) where m[prop] =~ "(?i)The .*"))
return m

But this query returned an empty response. No error, just no results.

When trying to isolate the problem, I've realized that the query does work in the following cases:

  1. If a node only has string properties (no numbers or booleans).
  2. If I use exact matches instead of regex (where NOT(any(prop in keys(m) where m[prop] = "Hoffa"))).
  3. If I search specific properties (where NOT(any(prop in ['title','tagline'] where m[prop] =~ "(?i)The .*")))

It seems that only the combination of not, any, and regex breaks the query, and I'm lost in finding out why this happens.

Yaron Schwimmer asked May 19 '16 15:05



2 Answers

At least in neo4j 3.0, the STARTS WITH syntax seems to work better for your scenario (but it is case sensitive):

MATCH (m:Movie)
WHERE NONE(prop in keys(m) where TOSTRING(m[prop]) STARTS WITH "The ")
RETURN m;
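
If case-insensitivity matters, one possible variation is to lower-case the stringified value before comparing. This is only a sketch: it assumes toLower() is available in your Neo4j version (older releases call it lower()) and that every property is a scalar value that toString() accepts, which may not hold for list properties.

```cypher
// Negated, case-insensitive prefix search over every property of a Movie node.
// Assumption: all property values are scalars that toString() can convert.
MATCH (m:Movie)
WHERE NONE(prop IN keys(m) WHERE toLower(toString(m[prop])) STARTS WITH 'the ')
RETURN m;
```

Because the search term is compared against a lower-cased value, the literal on the right-hand side has to be lower case as well.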
cybersam answered Oct 13 '22 04:10


Instead of using

NOT (ANY(...))

try

NONE(...)
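
Written out in full against the query from the question, the suggestion might look like the sketch below. The toString() wrapper is not part of this answer; it is added here on the assumption that a regex applied to a non-string value evaluates to null rather than false, which would leave the whole predicate null and filter out every row.

```cypher
// NONE(...) in place of NOT(ANY(...)), keeping the case-insensitive regex.
// Assumption: toString() accepts every property type present in the graph.
MATCH (m:Movie)
WHERE NONE(prop IN keys(m) WHERE toString(m[prop]) =~ '(?i)The .*')
RETURN m;
```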
Tim Kuehn answered Oct 13 '22 04:10