Solr: Filtering on the number of matches in an OR query to a multivalued field

Tags: solr

Given the following example solr documents:

<doc>
  <field name="guid">1</field>
  <field name="name">Harry Potter</field>
  <field name="friends">ron</field>
  <field name="friends">hermione</field>
  <field name="friends">ginny</field>
  <field name="friends">dumbledore</field>
</doc>
<doc>
  <field name="guid">2</field>
  <field name="name">Ron Weasley</field>
  <field name="friends">harry</field>
  <field name="friends">hermione</field>
  <field name="friends">lavender</field>
</doc>
<doc>
  <field name="guid">3</field>
  <field name="name">Hermione Granger</field>
  <field name="friends">harry</field>
  <field name="friends">ron</field>
  <field name="friends">ginny</field>
  <field name="friends">dumbledore</field>
</doc>

and the following query (or filter query):

friends:ron OR friends:hermione OR friends:ginny OR friends:dumbledore 

all three documents will be returned since they each have at least one of the specified friends.

However, I'd like to set a minimum (and maximum) threshold for how many friends are matched. For example, only return documents that have at least 2 but no more than 3 of the specified friends.

Such a query would only return the third document (Hermione Granger) as she has 3 of the 4 friends specified, while the first (Harry Potter) matches all 4 and the second (Ron Weasley) matches only 1.

Is this possible in a Solr query?

jiffybank asked May 10 '13 18:05



1 Answer

Use the termfreq function query to count how many of the specified terms (the "friends" in your case) each document matches. Sum those counts, then use frange to return only the documents whose sum falls within your thresholds, like this:

{!frange l=2 u=3}sum(termfreq(friends,'ron'),termfreq(friends,'hermione'),termfreq(friends,'ginny'),termfreq(friends,'dumbledore'))

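termfreq(field,term) returns the number of times the term occurs in that field of the document. Assuming each friend is listed at most once per document, as in your examples, that is 1 for each friend found and 0 otherwise. The sum of those values is what gets tested against your thresholds: l is the lower bound and u is the upper bound at the start of the frange expression, and both are inclusive by default.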
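For a longer list of friends, the filter string can be generated rather than typed by hand. The following is a minimal Python sketch, not part of Solr itself: `build_frange_filter` and `count_matches` are hypothetical helper names, the second one only imitates the summed termfreq calls locally on the three sample documents, and the terms are assumed to contain no quotes that would need escaping.

```python
def build_frange_filter(field, terms, lower, upper):
    """Build a {!frange} filter keeping documents that match between
    `lower` and `upper` (inclusive) of the given terms."""
    # One termfreq(...) call per term; Solr sums them per document.
    parts = ",".join("termfreq({},'{}')".format(field, t) for t in terms)
    return "{{!frange l={} u={}}}sum({})".format(lower, upper, parts)


def count_matches(doc_values, terms):
    """Local stand-in for the summed termfreq calls (not Solr code)."""
    return sum(doc_values.count(t) for t in terms)


wanted = ["ron", "hermione", "ginny", "dumbledore"]
fq = build_frange_filter("friends", wanted, 2, 3)

# The three sample documents from the question, keyed by name.
docs = {
    "Harry Potter": ["ron", "hermione", "ginny", "dumbledore"],
    "Ron Weasley": ["harry", "hermione", "lavender"],
    "Hermione Granger": ["harry", "ron", "ginny", "dumbledore"],
}
kept = [name for name, friends in docs.items()
        if 2 <= count_matches(friends, wanted) <= 3]

print(fq)    # the same filter string as shown above
print(kept)  # ['Hermione Granger']
```

Harry Potter sums to 4 and Ron Weasley to 1, so only Hermione Granger (3) falls inside the 2 to 3 range.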

You can place this in either the q parameter or the fq parameter.

[Screenshot: the query entered in the Solr admin panel]
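When sending the request over HTTP instead of through the admin panel, the braces, quotes, and spaces in the frange expression need URL encoding. A small sketch using only the Python standard library follows; the host, port, and core name (`mycore`) are placeholders for illustration, not values from the question.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical Solr location; substitute your own host and core.
base_url = "http://localhost:8983/solr/mycore/select"

frange = ("{!frange l=2 u=3}sum(termfreq(friends,'ron'),"
          "termfreq(friends,'hermione'),termfreq(friends,'ginny'),"
          "termfreq(friends,'dumbledore'))")

# Match everything with q, then restrict the results with the frange filter.
params = {"q": "*:*", "fq": frange}
url = base_url + "?" + urlencode(params)

print(url)
```

Here the expression is passed as fq with a match-all q, so it acts purely as a filter on the result set.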

jbnunn answered Nov 03 '22 00:11