I am just getting started with Elasticsearch and would like to use script-based sorting on a field that is mapped as date
, format hour_minute
. There can be multiple instances of the field in each document.
Before introducing expressions, as a first step I'm trying a simple sort (using the Sense plugin):
POST myIndex/_search
{
"query": {
"match_all": {}
},
"sort": {
"_script": {
"script": "doc[\"someTime\"].value",
"lang": "groovy",
"type": "date",
"order": "asc"
}
}
}
I get this error (fragment):
SearchPhaseExecutionException[Failed to execute phase [query], all shards failed;
shardFailures {[tjWL-zV5QXmGjNlXzLvrzw][myIndex][0]:
SearchParseException[[myIndex][0]:
query[ConstantScore(*:*)],from[-1],size[-1]: Parse Failure [Failed to parse source…
If I post the above query with "type": "number"
there is no error, although this of course doesn't sort by date. The following works fine:
POST myIndex/_search
{
"query": {
"match_all": {}
},
"sort": {
"someTime": {
"order": "asc"
}
}
}
Ultimately I'd like to use script-based sorting since I will be trying to query, filter or sort using date and time conditions, like query for documents with today’s date, then sort them by the lowest time that is after the time now, etc.
Any suggestions would be much appreciated.
Using scripts to sort documents is not really performant, especially if your document base is expected to grow over time. So I'm going to offer a solution for doing that and then suggest another option.
In order to sort using script, You need to transform your date into milliseconds so your sort can be run on a simple number (sort type can only be number
or string
).
POST myIndex/_search
{
"query": {
"match_all": {}
},
"sort": {
"_script": {
"script": "doc[\"someTime\"].date.getMillisOfDay()",
"lang": "groovy",
"type": "number", <----- make sure this is number
"order": "asc"
}
}
}
Note that depending on the granularity you want, you can also use getSecondOfDay()
or getMinuteOfDay()
. That way, provided your queries and filters have selected documents for the right day, your sort script will sort documents based on the number of milliseconds (or seconds or minutes) within that day.
The second solution would imply to also index the number of milliseconds (or seconds or minutes) since the beginning of that day into another field and simply use it to sort, so that you don't need script. The bottom line is that whatever information you need at search time that can be known at index time should be indexed instead of computed in real-time.
For instance, if your someTime
field contains the date 2015-10-05T05:34:12.276Z
then you'd index the millisOfDay
field with the value 20052276
, which is
Then you can sort using
POST myIndex/_search
{
"query": {
"range": {
"someTime": {
"gt": "now"
}
}
},
"sort": {
"millisOfDay": {
"order": "asc"
}
}
}
Note that I've added a query to select only the documents whose someTime
date is after now, so you'll get all documents in the future, but sorted by ascending millisOfDay
, which means you'll get the nearest date from now
first.
UPDATE
If someTime
has the format HH:mm
, then you can also store its millisOfDay
value, e.g. if someTime = 17:30
then millisOfDay
would be (17h * 3600000 ms) + (30 min * 60000 ms) = 63000000
Then, your query needs to be reworked a little bit using a script
filter, like this:
{
"query": {
"filtered": {
"filter": {
"script": {
"script": "doc.millisOfDay.value > new DateTime().millisOfDay"
}
}
}
},
"sort": {
"millisOfDay": {
"order": "asc"
}
}
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With