I am trying to aggregate over dynamic fields (different for different documents) via elasticsearch. Documents are like following:
[{
"name": "galaxy note",
"price": 123,
"attributes": {
"type": "phone",
"weight": "140gm"
}
},{
"name": "shirt",
"price": 123,
"attributes": {
"type": "clothing",
"size": "m"
}
}]
As you can see attributes change across documents. What Im trying to achieve is to aggregate fields of these attributes, like so:
{
aggregations: {
types: {
buckets: [{key: 'phone', count: 123}, {key: 'clothing', count: 12}]
}
}
}
I am trying aggregation feature of elasticsearch to achieve this, but not able to find correct way. Is it possible to achieve via aggregation ? Or should I start looking in to facets, thought it seem to be depricated.
Elasticsearch Aggregations provide you with the ability to group and perform calculations and statistics (such as sums and averages) on your data by using a simple search query. An aggregation can be viewed as a working unit that builds analytical information across a set of documents.
You can disable dynamic mapping, both at the document and at the object level. Setting the dynamic parameter to false ignores new fields, and strict rejects the document if Elasticsearch encounters an unknown field. Use the update mapping API to update the dynamic setting on existing fields.
Elasticsearch organizes aggregations into three categories: Metric aggregations that calculate metrics, such as a sum or average, from field values. Bucket aggregations that group documents into buckets, also called bins, based on field values, ranges, or other criteria.
You have to define attributes as nested in your mapping and change the layout of the single attribute values to the fixed layout { key: DynamicKey, value: DynamicValue }
PUT /catalog
{
"settings" : {
"number_of_shards" : 1
},
"mappings" : {
"article": {
"properties": {
"name": {
"type" : "string",
"index" : "not_analyzed"
},
"price": {
"type" : "integer"
},
"attributes": {
"type": "nested",
"properties": {
"key": {
"type": "string"
},
"value": {
"type": "string"
}
}
}
}
}
}
}
You may than index your articles like this
POST /catalog/article
{
"name": "shirt",
"price": 123,
"attributes": [
{ "key": "type", "value": "clothing"},
{ "key": "size", "value": "m"}
]
}
POST /catalog/article
{
"name": "galaxy note",
"price": 123,
"attributes": [
{ "key": "type", "value": "phone"},
{ "key": "weight", "value": "140gm"}
]
}
After all you are then able to aggregate over the nested attributes
GET /catalog/_search
{
"query":{
"match_all":{}
},
"aggs": {
"attributes": {
"nested": {
"path": "attributes"
},
"aggs": {
"key": {
"terms": {
"field": "attributes.key"
},
"aggs": {
"value": {
"terms": {
"field": "attributes.value"
}
}
}
}
}
}
}
}
Which then gives you the information you requested in a slightly different form
[...]
"buckets": [
{
"key": "type",
"doc_count": 2,
"value": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "clothing",
"doc_count": 1
}, {
"key": "phone",
"doc_count": 1
}
]
}
},
[...]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With