
How can you boost documents by recency in RavenDB?

Tags: lucene, ravendb

Is it possible to boost recent documents in a RavenDB query?

This question is exactly what I want to do but refers to native Lucene, not RavenDB.

For example, if I have a Document like this:

public class Document
{
    public string Title { get; set; }
    public DateTime DateCreated { get; set; }
}

How can I boost documents whose dates are closer to a given date, e.g. DateTime.UtcNow?

I do not want to OrderByDescending(x => x.DateCreated), as there are other search parameters that need to affect the results.

Greg B asked Dec 13 '12 16:12



1 Answer

You can boost during indexing. The feature has been in RavenDB for quite some time, but it is not in the documentation at all. However, there are some unit tests that illustrate it here.

Those tests show a single boost value, but it can easily be calculated from other document values instead. You have the full document available to you since this is done when the index entries are written. You should be able to combine this with the technique described in the post you referenced.

Map = docs => from doc in docs
              select new
              {
                  Title = doc.Title.Boost(doc.DateCreated.Ticks / 1000000f)
              };

You could also boost the entire document instead of just the Title field, which might be useful if you have other fields in your search algorithm:

Map = docs => from doc in docs
              select new
              {
                  doc.Title
              }.Boost(doc.DateCreated.Ticks / 1000000f);

You may need to experiment to find the right value for the boost amount. There are 10,000 ticks in a millisecond, which is why I divide by such a large number.
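To get a feel for the numbers while experimenting, here is a minimal sketch in plain .NET (outside of any index) that evaluates the same expression for two dates. The class name, the `TickBoost` helper, and the sample dates are hypothetical and only for illustration.

```csharp
using System;

public static class RecencyBoostSketch
{
    // Same expression as in the index definitions above, pulled into a
    // hypothetical helper so it can be evaluated on its own.
    public static float TickBoost(DateTime createdUtc)
    {
        return createdUtc.Ticks / 1000000f;
    }

    public static void Main()
    {
        // Hypothetical sample dates, one year apart, both in UTC.
        var older = new DateTime(2011, 12, 13, 0, 0, 0, DateTimeKind.Utc);
        var newer = new DateTime(2012, 12, 13, 0, 0, 0, DateTimeKind.Utc);

        Console.WriteLine(TimeSpan.TicksPerMillisecond); // 10000

        // Raw boost values for each date.
        Console.WriteLine(TickBoost(older));
        Console.WriteLine(TickBoost(newer));

        // Ratio of the two boosts: how strongly the newer date is favoured.
        Console.WriteLine(TickBoost(newer) / TickBoost(older));
    }
}
```

Because DateTime.Ticks counts from the year 0001, two recent dates give boosts that are large and close together, so the ratio comes out only slightly above 1. If that turns out to be too weak in your tests, one thing to try is measuring from a more recent fixed reference date before dividing.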

Also, make sure the DateTime you're working with is in UTC, or, if you don't have control over where it comes from, use a DateTimeOffset instead. Why? Because you're calculating a duration from some reference point, and you don't want the result to be ambiguous across time zones or around daylight saving time changes.
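The ambiguity is easy to see in plain .NET. The sketch below is only an illustration, with a hypothetical class name, helper, and sample values; I have not checked how `UtcTicks` behaves inside a RavenDB index definition. It shows that a DateTimeOffset gives the same `UtcTicks` for one instant regardless of the offset it was written with, while the clock-time `Ticks` differ.

```csharp
using System;

public static class UtcTicksSketch
{
    // Hypothetical helper: derive the boost from the UTC instant,
    // so the stored offset does not change the result.
    public static float TickBoost(DateTimeOffset created)
    {
        return created.UtcTicks / 1000000f;
    }

    public static void Main()
    {
        // The same instant written with two different offsets (sample values).
        var asUtc = new DateTimeOffset(2012, 12, 13, 16, 0, 0, TimeSpan.Zero);
        var asMinusFive = new DateTimeOffset(2012, 12, 13, 11, 0, 0, TimeSpan.FromHours(-5));

        Console.WriteLine(asUtc.UtcTicks == asMinusFive.UtcTicks);       // True
        Console.WriteLine(asUtc.Ticks == asMinusFive.Ticks);             // False
        Console.WriteLine(TickBoost(asUtc) == TickBoost(asMinusFive));   // True
    }
}
```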

Matt Johnson-Pint answered Oct 11 '22 01:10
