I'm using Solr 6.6.0, and here are the documents in the collection:
{"id":1,"content":"test1"}
{"id":2,"content":"test2"}
{"id":3,"content":"test3"}
Say I want to retrieve the documents that contain neither "test1" nor "test2". According to the "Grouping Terms to Form Sub-Queries" section of the reference guide, it seems legal to write the query string in the following way:
content:((NOT "test1") AND (NOT "test2"))
I expected the query to return only document #3, but the actual result is empty.
Alternatively, if the query is changed to the following, without parentheses surrounding the NOT expressions, the expected result is returned:
content:(NOT "test1" AND NOT "test2")
My question is: why does the first query string not work as expected?
Solr queries require escaping special characters that are part of the query syntax. The special characters are: +, -, &&, ||, !, (, ), ", ~, *, ?, and :. To escape one of these characters, put a backslash ( \ ) before it.
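The escaping rule can be sketched in a few lines of plain Java. This is only an illustration: the class `QueryEscapeSketch` and its `escape` method are hypothetical names, the character list is the one quoted above plus the backslash itself (an assumption), and each `&` and `|` is escaped individually.

```java
// Hypothetical helper, not part of Solr: prefixes query-syntax characters with a backslash.
final class QueryEscapeSketch {
    // Characters from the list above, plus the backslash itself (an assumption).
    private static final String SPECIAL = "+-&|!()\"~*?:\\";

    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 8);
        for (char c : raw.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                out.append('\\'); // escape marker goes before the special character
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("title:(1+1)?")); // prints title\:\(1\+1\)\?
    }
}
```

In a real client, SolrJ's `ClientUtils.escapeQueryChars` does this job and is the safer choice.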
numFound indicates the number of documents in the search index that matched your query. Solr only returns the specified number of documents in results, though. Without setting parameters, defaults are used; everything is configurable, either through the query string or in query configuration (see solrconfig.
The defType parameter selects the query parser that Solr should use to process the main query parameter (q) in the request. For example: defType=dismax. If no defType parameter is specified, the Standard Query Parser is used by default (equivalent to defType=lucene).
Solr provides Query (q parameter) and Filter Query (fq parameter) for searching. The query (q parameter), as the name suggests, is the main query used for searching. Example. q = title:james. Filter queries are used alongside query (q parameter) to limit results of queries using additional filters.
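As a rough illustration of how q, fq and defType travel together in one request, the sketch below builds a select URL by hand. The host, the collection name `mycollection`, the `year` field and the class `SelectUrlSketch` are all made up for the example; only the parameter names come from the text above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that assembles a Solr select URL from q, fq and defType.
final class SelectUrlSketch {
    // URL-encodes a single name=value pair (requires Java 10+ for this encode overload).
    static String param(String name, String value) {
        return name + "=" + URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    static String selectUrl(String base, String q, String fq, String defType) {
        return base + "/select?" + String.join("&",
                param("q", q),              // main query
                param("fq", fq),            // filter query that narrows the result set
                param("defType", defType)); // query parser for q
    }

    public static void main(String[] args) {
        // Assumed local Solr instance and collection name.
        System.out.println(selectUrl("http://localhost:8983/solr/mycollection",
                "title:james", "year:2017", "lucene"));
    }
}
```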
Solr currently checks for a "pure negative" query and inserts *:* (which matches all documents), so the latter form (the one without parentheses) works correctly.
See the code snippet below from org.apache.solr.search.QueryUtils:
/** Fixes a negative query by adding a MatchAllDocs query clause.
 * The query passed in *must* be a negative query.
 */
public static Query fixNegativeQuery(Query q) {
    BooleanQuery newBq = (BooleanQuery) q.clone();
    newBq.add(new MatchAllDocsQuery(), BooleanClause.Occur.MUST);
    return newBq;
}
So NOT "test" is transformed by Solr into (*:* NOT "test").
But Solr checks only the top-level query, so a query like (NOT "test1") is not changed, because the pure negative query is not at the top level.
This is why the former form (the one with parentheses) does not work as expected.
So we can conclude that, in general, the proper way to use the NOT operator is the (*:* NOT some_expression) form, rather than a bare NOT some_expression.
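To make the behavior concrete, here is a minimal Java sketch that models the three sample documents as sets of ids. It does not use any Lucene or Solr classes: `PureNegativeDemo` and its methods are hypothetical names, and the set arithmetic is only a toy approximation of the idea that a boolean query with no positive clause has nothing to subtract from.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: sets of document ids stand in for query results.
final class PureNegativeDemo {
    // The three sample documents from the question: id -> content.
    static final Map<Integer, String> DOCS = Map.of(1, "test1", 2, "test2", 3, "test3");

    // Models *:* (every document).
    static Set<Integer> matchAll() {
        return new TreeSet<>(DOCS.keySet());
    }

    // Models content:"value".
    static Set<Integer> term(String value) {
        Set<Integer> hits = new TreeSet<>();
        for (Map.Entry<Integer, String> e : DOCS.entrySet()) {
            if (e.getValue().equals(value)) hits.add(e.getKey());
        }
        return hits;
    }

    // Models a boolean query: keep docs found in every MUST set, drop docs in any MUST_NOT set.
    // With no MUST clause there is nothing to subtract from, so the result is empty.
    static Set<Integer> bool(List<Set<Integer>> must, List<Set<Integer>> mustNot) {
        if (must.isEmpty()) return new TreeSet<>();
        Set<Integer> result = new TreeSet<>(must.get(0));
        for (Set<Integer> m : must) result.retainAll(m);
        for (Set<Integer> n : mustNot) result.removeAll(n);
        return result;
    }

    // Models the top-level fix: a purely negative query gets a match-all clause added.
    static Set<Integer> topLevel(List<Set<Integer>> must, List<Set<Integer>> mustNot) {
        if (must.isEmpty() && !mustNot.isEmpty()) {
            return bool(List.of(matchAll()), mustNot);
        }
        return bool(must, mustNot);
    }

    // content:((NOT "test1") AND (NOT "test2")): the nested groups are purely negative and not fixed.
    static Set<Integer> nestedNegatives() {
        Set<Integer> left = bool(List.of(), List.of(term("test1")));
        Set<Integer> right = bool(List.of(), List.of(term("test2")));
        return topLevel(List.of(left, right), List.of());
    }

    // content:(NOT "test1" AND NOT "test2"): purely negative at the top level, so it is fixed.
    static Set<Integer> flatNegatives() {
        return topLevel(List.of(), List.of(term("test1"), term("test2")));
    }

    // content:((*:* NOT "test1") AND (*:* NOT "test2")): each group carries its own match-all.
    static Set<Integer> explicitMatchAll() {
        Set<Integer> left = bool(List.of(matchAll()), List.of(term("test1")));
        Set<Integer> right = bool(List.of(matchAll()), List.of(term("test2")));
        return topLevel(List.of(left, right), List.of());
    }

    public static void main(String[] args) {
        System.out.println("nested NOT groups: " + nestedNegatives());  // prints []
        System.out.println("flat NOT clauses:  " + flatNegatives());    // prints [3]
        System.out.println("explicit *:* form: " + explicitMatchAll()); // prints [3]
    }
}
```

Applied to the original question, this suggests rewriting the first query as content:((*:* NOT "test1") AND (*:* NOT "test2")), which in this model selects only document #3.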