
Negative operators (NOT, -, !) in a Solr query string don't work with parentheses

I'm using Solr 6.6.0, and here are the documents in the collection:

{"id":1,"content":"test1"}
{"id":2,"content":"test2"}
{"id":3,"content":"test3"}

Say I want to retrieve the documents that contain neither "test1" nor "test2". According to the "Grouping Terms to Form Sub-Queries" section of the reference guide, it seems legal to write the query string as follows:

content:((NOT "test1") AND (NOT "test2"))

The expected result is document #3 only, but the actual result is empty.

Alternatively, if the query is changed to the following, without parentheses around the NOT expressions, the expected result is returned:

content:(NOT "test1" AND NOT "test2")

My question is: why doesn't the first query string work as expected?

chao_chang asked Nov 28 '17 03:11



1 Answer

Solr currently checks for a "pure negative" query and inserts *:* (which matches all documents), which is why the latter format (the one without parentheses) works correctly.

See the code snippet below from org.apache.solr.search.QueryUtils:

/**
 * Fixes a negative query by adding a MatchAllDocs query clause.
 * The query passed in *must* be a negative query.
 */
public static Query fixNegativeQuery(Query q) {
  BooleanQuery newBq = (BooleanQuery) q.clone();
  newBq.add(new MatchAllDocsQuery(), BooleanClause.Occur.MUST);
  return newBq;
}

So NOT "test" is transformed by Solr into (*:* NOT "test").

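But Solr checks only the top-level query, so a nested clause such as (NOT "test1") is left unchanged because the pure negative query is not at the top level. A pure negative clause has no positive set of documents to subtract from, so it matches nothing, and the AND of two such clauses is empty. This is why the former format (the one with parentheses) does not work as expected.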
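To make the top-level-only check concrete, here is a toy model in plain Java. It uses no Solr or Lucene classes; every name in it (PureNegativeDemo, Bool, fixIfPureNegative) is hypothetical, and it assumes only that a boolean query with no required clauses matches nothing. It is a sketch of the behaviour described above, not Solr's implementation.

```java
import java.util.*;

/** Toy model of boolean-query matching over the three sample documents. Not Solr code. */
final class PureNegativeDemo {
    // The documents from the question: id -> content.
    static final Map<Integer, String> DOCS = new TreeMap<>();
    static {
        DOCS.put(1, "test1");
        DOCS.put(2, "test2");
        DOCS.put(3, "test3");
    }

    interface Q { Set<Integer> match(); }

    static List<Q> list(Q... qs) { return Arrays.asList(qs); }

    /** Matches documents whose content equals the given text. */
    static Q term(String text) {
        return () -> {
            Set<Integer> ids = new TreeSet<>();
            for (Map.Entry<Integer, String> e : DOCS.entrySet()) {
                if (e.getValue().equals(text)) ids.add(e.getKey());
            }
            return ids;
        };
    }

    /** Stands in for *:*. */
    static Q matchAll() { return () -> new TreeSet<>(DOCS.keySet()); }

    /** Boolean query with required (must) and prohibited (mustNot) clauses. */
    static final class Bool implements Q {
        final List<Q> must;
        final List<Q> mustNot;
        Bool(List<Q> must, List<Q> mustNot) { this.must = must; this.mustNot = mustNot; }

        public Set<Integer> match() {
            // Assumption: with no required clause there is nothing to subtract from.
            if (must.isEmpty()) return new TreeSet<>();
            Set<Integer> result = new TreeSet<>(must.get(0).match());
            for (Q q : must) result.retainAll(q.match());
            for (Q q : mustNot) result.removeAll(q.match());
            return result;
        }
    }

    /** Mimics the top-level check: only the outermost query is inspected. */
    static Q fixIfPureNegative(Q top) {
        if (top instanceof Bool) {
            Bool b = (Bool) top;
            if (b.must.isEmpty() && !b.mustNot.isEmpty()) {
                return new Bool(list(matchAll()), b.mustNot);
            }
        }
        return top;
    }

    /** content:((NOT "test1") AND (NOT "test2")) : negatives are nested, so no fix applies. */
    static Set<Integer> withParentheses() {
        Q top = new Bool(list(new Bool(list(), list(term("test1"))),
                              new Bool(list(), list(term("test2")))), list());
        return fixIfPureNegative(top).match();
    }

    /** content:(NOT "test1" AND NOT "test2") : pure negative at the top level, so it is fixed. */
    static Set<Integer> withoutParentheses() {
        Q top = new Bool(list(), list(term("test1"), term("test2")));
        return fixIfPureNegative(top).match();
    }

    /** (*:* NOT "test1") AND (*:* NOT "test2") : each group carries its own match-all clause. */
    static Set<Integer> withExplicitMatchAll() {
        Q top = new Bool(list(new Bool(list(matchAll()), list(term("test1"))),
                              new Bool(list(matchAll()), list(term("test2")))), list());
        return fixIfPureNegative(top).match();
    }

    public static void main(String[] args) {
        System.out.println(withParentheses());      // []
        System.out.println(withoutParentheses());   // [3]
        System.out.println(withExplicitMatchAll()); // [3]
    }
}
```

In this model the parenthesized query stays empty because the outermost query has required clauses and is therefore never treated as pure negative, while each nested group still has nothing to subtract from.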

So we can conclude that, in general, the safe way to use the NOT operator is the (*:* NOT some_expression) form rather than a bare NOT some_expression.
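Applied to the query from the question, that form could be generated with a small helper like the one below. This is only a sketch: the class and method names are hypothetical, it builds a plain query string rather than calling any Solr API, and the escaping handles only backslashes and double quotes inside a quoted phrase.

```java
import java.util.Arrays;
import java.util.List;
import java.util.StringJoiner;

/** Hypothetical helper that writes exclusions in the (*:* NOT ...) form. */
final class NegativeClauses {
    /** Wraps one excluded phrase as (*:* NOT field:"phrase"). */
    static String exclude(String field, String phrase) {
        // Escape backslashes first, then double quotes, so the phrase stays quoted.
        String escaped = phrase.replace("\\", "\\\\").replace("\"", "\\\"");
        return "(*:* NOT " + field + ":\"" + escaped + "\")";
    }

    /** Joins one exclusion group per phrase with AND. */
    static String excludeAll(String field, List<String> phrases) {
        StringJoiner joined = new StringJoiner(" AND ");
        for (String phrase : phrases) {
            joined.add(exclude(field, phrase));
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // Prints: (*:* NOT content:"test1") AND (*:* NOT content:"test2")
        System.out.println(excludeAll("content", Arrays.asList("test1", "test2")));
    }
}
```

Because every group carries its own *:* clause, the result no longer depends on whether the negative clause ends up at the top level of the parsed query.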

chao_chang answered Sep 21 '22 04:09