Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Script-based sorting on Elasticsearch date field

I am just getting started with Elasticsearch and would like to use script-based sorting on a field that is mapped as date, format hour_minute. There can be multiple instances of the field in each document.

Before introducing expressions, as a first step I'm trying a simple sort (using the Sense plugin):

POST myIndex/_search
{
   "query": {
      "match_all": {}
   },
   "sort": {
      "_script": {
         "script": "doc[\"someTime\"].value",
         "lang": "groovy",
         "type": "date",
         "order": "asc"
      }
   }
}

I get this error (fragment):

SearchPhaseExecutionException[Failed to execute phase [query], all shards failed;
shardFailures {[tjWL-zV5QXmGjNlXzLvrzw][myIndex][0]:
SearchParseException[[myIndex][0]: 
query[ConstantScore(*:*)],from[-1],size[-1]: Parse Failure [Failed to parse source…

If I post the above query with "type": "number" there is no error, although this of course doesn't sort by date. The following works fine:

POST myIndex/_search
{
   "query": {
      "match_all": {}
   },
   "sort": {
      "someTime": {
         "order": "asc"
      }
   }
}

Ultimately I'd like to use script-based sorting since I will be trying to query, filter or sort using date and time conditions, like query for documents with today’s date, then sort them by the lowest time that is after the time now, etc.

Any suggestions would be much appreciated.

like image 590
Michael Mueller Avatar asked Oct 04 '15 11:10

Michael Mueller


1 Answers

Using scripts to sort documents is not really performant, especially if your document base is expected to grow over time. So I'm going to offer a solution for doing that and then suggest another option.

In order to sort using script, You need to transform your date into milliseconds so your sort can be run on a simple number (sort type can only be number or string).

POST myIndex/_search
{
   "query": {
      "match_all": {}
   },
   "sort": {
      "_script": {
         "script": "doc[\"someTime\"].date.getMillisOfDay()",
         "lang": "groovy",
         "type": "number",       <----- make sure this is number
         "order": "asc"
      }
   }
}

Note that depending on the granularity you want, you can also use getSecondOfDay() or getMinuteOfDay(). That way, provided your queries and filters have selected documents for the right day, your sort script will sort documents based on the number of milliseconds (or seconds or minutes) within that day.

The second solution would imply to also index the number of milliseconds (or seconds or minutes) since the beginning of that day into another field and simply use it to sort, so that you don't need script. The bottom line is that whatever information you need at search time that can be known at index time should be indexed instead of computed in real-time.

For instance, if your someTime field contains the date 2015-10-05T05:34:12.276Z then you'd index the millisOfDay field with the value 20052276, which is

  • 5 hours * 3600000 ms
  • +34 minutes * 60000 ms
  • +12 seconds * 1000 ms
  • +276 ms

Then you can sort using

POST myIndex/_search
{
   "query": {
      "range": {
          "someTime": {
              "gt": "now"
          }
      }
   },
   "sort": {
      "millisOfDay": {
         "order": "asc"
      }
   }
}

Note that I've added a query to select only the documents whose someTime date is after now, so you'll get all documents in the future, but sorted by ascending millisOfDay, which means you'll get the nearest date from now first.

UPDATE

If someTime has the format HH:mm, then you can also store its millisOfDay value, e.g. if someTime = 17:30 then millisOfDay would be (17h * 3600000 ms) + (30 min * 60000 ms) = 63000000

Then, your query needs to be reworked a little bit using a script filter, like this:

{
  "query": {
    "filtered": {
      "filter": {
        "script": {
          "script": "doc.millisOfDay.value > new DateTime().millisOfDay"
        }
      }
    }
  },
  "sort": {
    "millisOfDay": {
      "order": "asc"
    }
  }
}
like image 138
Val Avatar answered Oct 01 '22 17:10

Val