
Multiple queries in Solr

I have n fields (around 10) in Solr that are searchable; all of them are indexed and stored. I would like to run a first query against my whole index of, say, 5000 docs, which will hit around 500 docs on average. Next, I would like to run a query with a different set of keywords against only those 500 docs, NOT against the whole index.

The first query generates a score. When I run the second query, the new score should be based on the 500 documents returned by the previous query. In other words, Solr should treat only those 500 docs as the whole index.

To summarise: the index of 5000 docs will be filtered to 500 and then to 50 (5000>500>50). It's basically filtering, but I would like to do it in Solr.

I have reasonable basic knowledge of Solr and am still learning.

Update: If represented mathematically it would look like this:

results1=f(query1)
results2=f(query2, results1)
final_results=f(query3, results2)

I would like to accomplish this programmatically, so that the end user only sees the final 50 results. That is why faceting is not an option.

user2575429 asked Jul 12 '13 07:07


2 Answers

Two likely implementations occur to me. The simplest approach would be to add the first query to the second query, marking both as required:

+(first query) +(new query)
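As a minimal sketch of the combined form above, the snippet below builds the query string in Python using only the standard library. The field names and terms (`title:solr`, `body:caching`) and the helper name `require_both` are assumptions made up for illustration; they do not come from the question.

```python
from urllib.parse import urlencode


def require_both(first_query: str, new_query: str) -> str:
    """Wrap each query in parentheses and mark both as required with '+'."""
    return f"+({first_query}) +({new_query})"


# Hypothetical field names and terms, for illustration only.
combined = require_both("title:solr", "body:caching")

# URL-encode the value so '+' and the parentheses survive the HTTP request.
query_string = urlencode({"q": combined})

print(combined)
print(query_string)
```

Encoding matters here because an unencoded `+` in a URL is read as a space, which would silently drop the "required" marker.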

This is a good approach if the first query, which you want to filter on, changes often. If the first query is something like a category of documents, or something similar where you can benefit from reuse of the same filter, then a filter query is the better approach, using the fq parameter, something like:

q=field:query2&fq=categoryField:query1

Filter queries cache the set of document ids to filter against, so commonly used searches such as categories or common date ranges can gain a significant performance benefit. For uncommon searches or user-entered search strings, caching the results may just incur needless overhead and pollute the cache with a useless result set.
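One way to act on the caching trade-off is Solr's `cache=false` local parameter, which asks Solr not to store a given filter in the filter cache. The sketch below is an illustration under assumptions: the helper `filter_param` and its `reuse_expected` flag are hypothetical names, and whether a particular filter is worth caching depends on your own traffic.

```python
from urllib.parse import urlencode


def filter_param(filter_query: str, reuse_expected: bool) -> str:
    """Return an fq value, opting out of the filter cache for one-off filters."""
    if reuse_expected:
        # Reused filters (categories, common date ranges) benefit from caching.
        return filter_query
    # One-off filters (e.g. user-entered text) skip the cache via a local param.
    return "{!cache=false}" + filter_query


# Field names follow the example above; the reuse decisions are assumptions.
cached = [("q", "field:query2"),
          ("fq", filter_param("categoryField:query1", reuse_expected=True))]
uncached = [("q", "field:query2"),
            ("fq", filter_param("categoryField:query1", reuse_expected=False))]

print(urlencode(cached))
print(urlencode(uncached))
```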

femtoRgon answered Sep 24 '22 10:09


Filter queries (fq) are designed specifically to restrict the result set quickly, without doing any score calculation.

So, if you put your first query into the fq parameter and your second, score-generating query into the normal 'q' parameter, it should do what you ask for.
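For the three-step narrowing in the question (5000>500>50), one possible arrangement is to send every earlier query as its own fq parameter and only the last query as q. Solr accepts several fq parameters on one request and returns only documents matching all of them. The following is a rough sketch, not a definitive implementation; the helper `narrowing_params` and the `fieldN:queryN` values are hypothetical.

```python
from urllib.parse import urlencode


def narrowing_params(queries):
    """Use all but the last query as filters (fq) and the last one for scoring (q)."""
    if not queries:
        raise ValueError("at least one query is required")
    *filters, scoring = queries
    params = [("q", scoring)]
    # Each earlier query becomes a separate fq; Solr intersects them.
    params += [("fq", f) for f in filters]
    return params


# Placeholder queries standing in for query1, query2 and query3.
steps = ["field1:query1", "field2:query2", "field3:query3"]
params = narrowing_params(steps)

print(params)
print(urlencode(params))
```

Only the q part contributes to the score here, so the final ordering reflects the last query alone.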

See also a question discussing this issue from the opposite direction.

Alexandre Rafalovitch answered Sep 24 '22 10:09