Let's say I am indexing a bunch of Products into Elasticsearch, along with the Stores in which each product is available. For example, a document looks something like:
{
  name: "iPhone 6s",
  price: 600.0,
  stores: [
    {
      name: "Apple Store Union Square",
      location: "San Francisco, CA"
    },
    {
      name: "Target Cupertino",
      location: "Cupertino, CA"
    },
    {
      name: "Apple Store 5th Avenue",
      location: "New York, NY"
    }
    ...
  ]
}
and using the nested type, the mappings will be:
"mappings" : {
  "product" : {
    "properties" : {
      "name" : {
        "type" : "string"
      },
      "price" : {
        "type" : "float"
      },
      "stores" : {
        "type" : "nested",
        "properties" : {
          "name" : {
            "type" : "string"
          },
          "location" : {
            "type" : "string"
          }
        }
      }
    }
  }
}
I want to create a query that finds all the products available in a certain location, say "CA", and then sorts them by the number of stores matched. I know Elasticsearch has an inner hits feature which allows me to find hits in the nested Store documents, but is sorting Products based on the doc_count of the inner hits possible? And to extend the question further, is sorting the parent documents based on some inner aggregation possible? Thanks in advance.
What you are trying to achieve is possible. You are currently not getting the expected results because the score_mode parameter of the nested query defaults to avg. A product that matches 5 stores can therefore score lower than one that matches only 2 stores, because the parent's _score is the average of the scores of its matching nested documents.
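To make the difference concrete, here is a small Python sketch of how the parent score is derived under avg versus sum. It does not call Elasticsearch; the per-store scores and the names nested_scores, parent_score and rank are made up purely for illustration.

```python
from statistics import mean

# Hypothetical per-store scores for two products. These numbers are invented
# for illustration and are not real Elasticsearch output.
nested_scores = {
    "product_a": [1.0, 1.0, 1.0, 1.0, 1.0],  # five matching stores
    "product_b": [1.2, 1.2],                  # two matching stores
}


def parent_score(scores, score_mode):
    """Combine the scores of matching nested docs into one parent score."""
    if score_mode == "avg":
        return mean(scores)
    if score_mode == "sum":
        return sum(scores)
    raise ValueError("unsupported score_mode: " + score_mode)


def rank(score_mode):
    """Return product names ordered by parent score, highest first."""
    return sorted(
        nested_scores,
        key=lambda name: parent_score(nested_scores[name], score_mode),
        reverse=True,
    )


print(rank("avg"))  # product_b ranks first despite matching fewer stores
print(rank("sum"))  # product_a ranks first because it matches more stores
```

With avg, the number of matching stores drops out of the parent score entirely, which is why it cannot be used to rank by match count.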
This problem can be solved by summing the scores of all the inner hits, which you do by setting score_mode to sum. One minor remaining problem is the field-length norm: a match in a shorter field gets a higher score than a match in a longer field. In your example, Cupertino, CA will get a slightly higher score than San Francisco, CA. You can check this behavior with inner hits. To solve this, disable the field norms by changing the location mapping to:
"location": {
  "type": "string",
  "norms": {
    "enabled": false
  }
}
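As a rough illustration of why norms matter here, the sketch below assumes the classic TF/IDF similarity, where the length norm is approximately 1 / sqrt(number of terms in the field). It uses a simplified tokenizer of my own (length_norm is a hypothetical helper, not an Elasticsearch API) and ignores the lossy single-byte encoding Lucene applies to stored norms, so the real values will differ somewhat.

```python
import math


def length_norm(text):
    """Approximate classic length norm: 1 / sqrt(number of terms).

    Assumption: whitespace tokenization with commas stripped, which only
    roughly imitates what a real analyzer does.
    """
    tokens = [token.strip(",") for token in text.split()]
    tokens = [token for token in tokens if token]
    return 1.0 / math.sqrt(len(tokens))


for location in ["Cupertino, CA", "San Francisco, CA"]:
    print(location, round(length_norm(location), 4))
```

The two-term field gets a larger norm than the three-term field, so two stores that both match "CA" do not contribute equally to the sum unless norms are disabled.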
After that, this query will give you the desired results. I included inner_hits to demonstrate that every matched nested doc gets an equal score.
{
  "query": {
    "nested": {
      "path": "stores",
      "query": {
        "match": {
          "stores.location": "CA"
        }
      },
      "score_mode": "sum",
      "inner_hits": {}
    }
  }
}
Since results are sorted by _score in descending order by default, this will sort the products by the number of stores matched.
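If you build the request from application code, something like the following sketch may be handy. It only constructs and prints the same request body as a Python dict; build_location_query is a hypothetical helper name, and sending the body to your cluster is left to whatever client you already use.

```python
import json


def build_location_query(location, path="stores"):
    """Build a nested query that sums the scores of matching nested docs."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"match": {path + ".location": location}},
                "score_mode": "sum",  # add up nested scores instead of averaging
                "inner_hits": {},     # return the matching stores per product
            }
        }
    }


body = build_location_query("CA")
print(json.dumps(body, indent=2))
```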
Hope this helps!