This isn't about multi-level sorting.
I need my results first selected by distance, limited to 50, then those 50 sorted by price.
select *
from
(
select top 50 * from mytable order by distance asc)
)
order by price asc
Essentially, the second sort throws away the ordering of the inner sort - but the inner sort is used to hone in on the top 50 results.
The other answers I've seen for this sort of question looks at second-level sorting, which is not what I'm after.
BTW: I've looked at aggregations - Top N results, but I'm not sure I can apply a sort on the aggregation result sort. Also looked at rescore
, but I don't know where to put my 'sorts'
Elasticsearch comes with a good default out of the box. It sorts the results by relevance to the search query term, most relevant first. Elasticsearch measures the relevance score as a floating-point number called _score, and orders results in the descending order of their _score values.
A top hits aggregation will allow you to sort on a separate field, in your case price
from the main query sort (on distance
). See the documentation here for how to specify sorting in the top hits agg.
It'll look a little like this (which assumes distance
is a double type; if it's a geo-location type, use the documentation provided by Volodymyr Bilyachat.)
{
"sort":[
{
"distance":"asc"
}
],
"query":{
"match_all":{}
},
"size":50,
"aggs":{
"top_price_hits":{
"top_hits":{
"sort":[
{
"price":{
"order":"asc"
}
}
],
"size":50
}
}
}
}
However, if there are only 50 results that you want from your primary query, why don't you just sort in the application client side? This would be a better approach as using a top hits aggregation for a secondary sort is a slight abuse of its purpose.
The in-application approach would be more robust.
+1'ed the accepted answer, but I wanted to make sure you were aware of how search scoring, can often deliver a better user experience than traditional sorting.
Based on your current strategy, one could say:
Distance is important, relatively speaking (e.g. top 50 closest) but not in absolute terms (e.g. must be within 50mi).
You only want to show 50 results.
You want those results to be sorted by price (or perhaps alphabetically).
However, if you find yourself trying to generalize about which result a searcher is most likely to choose, you may discover a function of price and distance (or other features) which better models the real-world likelihood of a searcher choosing a particular result.
E.g. Say you discover that
Then you could model a sample scoring function that generates a result ordering based on this relationship.
E.g. 1/price + 1/distance
... which would generate a higher score as either price or distance decreased.
Which could be generalized to P * 1/price + 1/distance
where P represented a tuning coefficient expressing the relative importance of price vs distance.
Armed with this model, you could then write a function score query which would output ordered results with the optimal combinations of price and distance for your users.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With