Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Elasticsearch sorting on string not returning expected results

When sorting on a string field with multiple words, Elasticsearch is splitting the string value and using the min or max as the sort value. I.E.: when sorting on a field with the value "Eye of the Tiger" in ascending order, the sort value is: "Eye" and when sorting in descending order the value is: "Tiger".

Lets say I have "Eye of the Tiger" and "Wheel of Death" as entries in my index, when I do an ascending sort on this field, I would expect, "Eye of the Tiger" to be first, since "E" comes before "W", but what I'm seeing when sorting on this field, "Wheel of Death" is coming up first, since "D" is the min value of that term and "E" is the min value of "Eye of the Tiger".

Does anyone know how to turn off this behavior and just allow a regular sort on this string field?

like image 550
willz Avatar asked Jan 27 '14 19:01

willz


People also ask

How do I sort data in Elasticsearch?

Sort mode optioneditPick the highest value. Use the sum of all values as sort value. Only applicable for number based array fields. Use the average of all values as sort value.

What is the difference between text and keyword in Elasticsearch?

The primary difference between the text datatype and the keyword datatype is that text fields are analyzed at the time of indexing, and keyword fields are not. What that means is, text fields are broken down into their individual terms at indexing to allow for partial matching, while keyword fields are indexed as is.

How Elasticsearch sorting works?

In order to sort by relevance, we need to represent relevance as a value. In Elasticsearch, the relevance score is represented by the floating-point number returned in the search results as the _score, so the default sort order is _score descending.


1 Answers

As mconlin mentioned if you want to sort on the unanalyzed doc field you need to specify "index": "not_analyzed" to sort as you described. But if you're looking to be able to keep this field tokenized to search on, this post by sloan shows a great example. Using multi-field to keep two different mappings for a field is very common in Elasticsearch.

Hope this helps, let me know if I can offer more explanation.

like image 75
Michael at qbox.io Avatar answered Oct 02 '22 13:10

Michael at qbox.io