
Which special characters need escaping in a Solr query?

Update: I think this question is about Solr syntax in general, not Chef in particular. I ran into it while working with Chef, but I presume anyone working with Solr will hit the same thing.


I'm working on an application that communicates with the Chef server's search API to find particular nodes.

Based on this http://docs.opscode.com/essentials_search.html#special-characters, it seems that a number of special characters need to be escaped.

Note: I'm only concerned with exact-matching patterns, not wildcards. I realize that some of these characters are special precisely because they act as wildcards.

Here's the list at the time of this writing, as copied from the URL above:

+  -  &&  ||  !  ( )  { }  [ ]  ^  "  ~  *  ?  :  \

When I try various knife search commands with these characters, however, I see inconsistent behaviour.

For the following examples, I set up a node that is tagged with "+&|!(){}[]^\"~*?:\\" (that is, the literal string +&|!(){}[]^"~*?:\).

These commands were run from a Linux box, in a bash shell:

$ knife search node 'tags:+&|!(){}[]^"~*?:\'
ERROR: knife search failed: invalid search query: 'tags:+&|!(){}[]^"~*?:\'

That behaved as expected, since nothing was escaped. Now, I escape everything with a single \ as the docs suggest:

$ knife search node 'tags:\+\&\|\!\(\)\{\}\[\]\^\"\~\*\?\:\\'
ERROR: knife search failed: invalid search query: 'tags:\+\&\|\!\(\)\{\}\[\]\^\"\~\*\?\:\\'

Strange.
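
One thing worth ruling out is the shell itself. Inside single quotes, a POSIX shell passes backslashes through untouched, so knife should receive the query exactly as typed. The sketch below uses Python's shlex module as a stand-in for the shell's word splitting (an approximation of POSIX rules, not bash itself); the helper name argv_as_shell_sees is made up for illustration.

```python
import shlex


def argv_as_shell_sees(command: str) -> list[str]:
    """Split a command line roughly the way a POSIX shell would."""
    return shlex.split(command)


if __name__ == "__main__":
    # The second command from the question, copied verbatim.
    command = r"""knife search node 'tags:\+\&\|\!\(\)\{\}\[\]\^\"\~\*\?\:\\'"""
    argv = argv_as_shell_sees(command)
    # The last argument is the query string that knife would be handed.
    print(argv[-1])
```

If the printed query still carries every backslash, the quoting on the command line is not what is dropping or mangling the escapes.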

Can anyone shed some light on this, and maybe suggest a query that's capable of matching that tag?

It's obviously unlikely that anyone will ever have an attribute containing all those special characters, but I'd like to understand better how the special characters should be escaped.

Thanks!

hairyhenderson asked Feb 20 '14 17:02



2 Answers

You need to follow the Lucene query parser's rules for escaping special characters, which Solr shares: http://lucene.apache.org/core/6_5_1/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Escaping_Special_Characters

sethvargo answered Nov 11 '22 07:11


It might be a good idea to look at http://lucene.apache.org/solr/4_2_1/solr-solrj/org/apache/solr/client/solrj/util/ClientUtils.html#escapeQueryChars(java.lang.String)
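
For anyone not on the JVM, here is a minimal sketch of the same idea in Python. It assumes it is enough to put a backslash in front of every character from the list quoted in the question; the real escapeQueryChars may cover additional characters (whitespace, for example), so check its source before relying on this. The names SPECIAL_CHARS and escape_query_chars are made up for illustration.

```python
# Characters taken from the list quoted in the question. The two-character
# operators && and || are covered by escaping each & and | individually.
SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\')


def escape_query_chars(term: str) -> str:
    """Prefix every special character in `term` with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in term)


if __name__ == "__main__":
    tag = '+&|!(){}[]^"~*?:\\'  # the tag from the question
    print("tags:" + escape_query_chars(tag))
```

For the tag in the question this produces the same query the asker already tried by hand, so it confirms the escaping was applied as documented; it does not by itself explain why the Chef server rejected it.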

Anatoli Radoulov answered Nov 11 '22 05:11