
Order solr documents with same score by date added descending

Tags: solr, relevance

I want search results from Solr ordered like this:

All documents that have the same score should be ordered by date added, descending.

So when I query Solr I get n documents. Within this result set there will be groups of documents with the same score. I want each of these groups to be ordered by date added, descending.

I discovered that I can accomplish this using function queries, specifically the rord function (http://wiki.apache.org/solr/FunctionQuery#rord), but the documentation warns that it can cause excess memory use:

WARNING: as of Solr 1.4, ord() and rord() can cause excess memory use since they must use a FieldCache entry at the top level reader, while sorting and function queries now use entries at the segment level. Hence sorting or using a different function query, in addition to ord()/rord() will double memory use.

What other options do I have?

I was thinking of using recip(ms(NOW,startTime),1,1,0). Is this the best approach?

Is there any negative performance impact if I use recip and ms?

Dorin asked Feb 10 '12 12:02


2 Answers

You can use multiple SORT conditions:

Multiple sort orderings can be separated by a comma, i.e.: sort=<field name>+<direction>[,<field name>+<direction>]...

http://wiki.apache.org/solr/CommonQueryParameters

So in your case it would be: sort=score desc, date_added desc
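To make the tie-breaking behaviour concrete, here is a minimal Python sketch of what a two-key sort does to a result set. It only simulates the ordering on the client side and is not how Solr implements it. The sample documents, the query `title:solr`, and the helper name `solr_like_order` are made up for illustration; `date_added` is the field name assumed in the answer above, and the sort string is written in the lowercase form used in the Solr documentation.

```python
from datetime import datetime
from urllib.parse import urlencode

# Hypothetical result set: three documents share the score 1.2.
docs = [
    {"id": "a", "score": 1.2, "date_added": datetime(2012, 1, 5)},
    {"id": "b", "score": 1.2, "date_added": datetime(2012, 2, 1)},
    {"id": "c", "score": 0.7, "date_added": datetime(2012, 2, 9)},
    {"id": "d", "score": 1.2, "date_added": datetime(2011, 12, 20)},
]


def solr_like_order(documents):
    """Order by score descending, then by date_added descending for ties."""
    return sorted(
        documents,
        key=lambda d: (d["score"], d["date_added"]),
        reverse=True,
    )


ordered_ids = [d["id"] for d in solr_like_order(docs)]
print(ordered_ids)

# The same ordering expressed as URL-encoded request parameters.
params = urlencode({"q": "title:solr", "sort": "score desc, date_added desc"})
print(params)
```

The date only matters inside a group of equal scores: document c is the newest but still comes last because its score is lower.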

Stelian Matei answered Nov 15 '22 03:11


Since your question says:

All the documents that have the same score will be ordered descending by date added.

the other answer you got is perfect.

That said, I'd suggest making sure that you really want to sort by date only for documents with the same score. In my experience this has always been wrong: the Solr score is not absolute but only relative to other documents, and each document is different.

Therefore I wouldn't sort by score and then by something else, because it's hard to predict when different documents will have the same score. I would personally sort only on score and use a function to boost recent documents. You can find a good example on the Solr wiki; the function used there is recip(ms(NOW,date_field),3.16e-11,1,1).
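For a feel of what that function does, here is a rough Python sketch of the arithmetic. It relies on recip(x,m,a,b) being defined as a/(m*x+b) in the Solr function query documentation; the helper names `recip` and `recency_boost` and the sample ages are mine, chosen only for illustration. The constant 3.16e-11 is roughly one divided by the number of milliseconds in a year.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000


def recip(x, m, a, b):
    """Plain-Python version of the recip formula: a / (m * x + b)."""
    return a / (m * x + b)


def recency_boost(age_days):
    """Value of recip(ms(NOW,date_field),3.16e-11,1,1) for a document
    that was added age_days days ago."""
    age_ms = age_days * MS_PER_DAY
    return recip(age_ms, 3.16e-11, 1, 1)


# Print the boost for a few sample ages (in days).
for age in (0, 30, 365, 3650):
    print(age, recency_boost(age))
```

A brand-new document gets a value of 1, a one-year-old document about 0.5, and the value keeps falling smoothly with age. By contrast, the parameters from the question (m=1, a=1, b=0) reduce the formula to 1/x on a millisecond scale, which is undefined at x=0 and vanishingly small for any realistic age.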

If you're worried about performance, you can try index-time boosting, which should be faster than query-time boosting. Have a look here.

javanna answered Nov 15 '22 01:11