
How to group results in elasticsearch?

I am storing book titles in Elasticsearch, and each title can belong to many stores. Like this:

{
    "books": [
        {
            "id": 1,
            "title": "Title 1",
            "store": "store1" 
        },
        {             
            "id": 2,
            "title": "Title 1",
            "store": "store2" 
        },
        {             
            "id": 3,
            "title": "Title 1",
            "store": "store3" 
        },
        {             
            "id": 4,
            "title": "Title 2",
            "store": "store2" 
        },
        {             
            "id": 5,
            "title": "Title 2",
            "store": "store3" 
        }
    ]
}

How can I get all the books grouped by title, with one result per group (one row per title, so I can get all the ids and stores for it)?

Based on the data above, I want to get two results, each containing all the ids and stores for its title.

Expected results:

{
"hits":{
    "total" : 2,
    "hits" : [
        {                
            "0" : {
                "title" : "Title 1",
                "group": [
                     {
                         "id": 1,
                         "store": "store1"
                     },
                     {
                         "id": 2,
                         "store": "store2"
                     },
                     {
                         "id": 3,
                         "store": "store3"
                     }
                ]
            }
        },
        {                
            "1" : {
                "title" : "Title 2",
                "group": [
                     {
                         "id": 4,
                         "store": "store2"
                     },
                     {
                         "id": 5,
                         "store": "store3"
                     }
                ]
            }
        }
    ]
}
}
TroodoN-Mike asked Apr 15 '14 14:04


2 Answers

What you are looking for is not possible in Elasticsearch, at least not with the current version (1.1).

There is a long-standing open issue for this feature with a lot of +1's and demand behind it.

As for official statements: Simon says it requires a lot of refactoring, and although it is planned, there is no way of saying when it will be implemented, let alone shipped.

Clinton Gormley made a similar statement in his webinar: field grouping takes a lot of effort to do right, especially since Elasticsearch is sharded and distributed by nature. It would not be that big a deal if you ignored sharding, but Elasticsearch wants to ship only features that scale with the complete system and work as well on hundreds of machines as they do on a single box.

If you're not tied to Elasticsearch, Solr offers such a feature.

Otherwise, probably the best solution at the moment is to do this client side. That is, query for some documents, do the grouping on your client and, if needed, fetch some more results to satisfy your desired group size (as far as I know, this is what Solr is doing under the hood).
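The grouping step of the client-side approach could be sketched roughly like this in Python. It is only an illustration: `group_hits_by_title` and `sample_hits` are hypothetical names, the hits are hand-written to mirror the documents from the question, and the "fetch some more results" step is not covered.

```python
from collections import OrderedDict


def group_hits_by_title(hits):
    """Group search hits by title, keeping titles in first-seen order."""
    groups = OrderedDict()
    for hit in hits:
        source = hit["_source"]
        # Create the group for this title the first time it is seen.
        entry = groups.setdefault(
            source["title"], {"title": source["title"], "group": []}
        )
        entry["group"].append({"id": source["id"], "store": source["store"]})
    return list(groups.values())


# Hand-written hits shaped like the "hits.hits" array of a search response.
sample_hits = [
    {"_source": {"id": 1, "title": "Title 1", "store": "store1"}},
    {"_source": {"id": 2, "title": "Title 1", "store": "store2"}},
    {"_source": {"id": 3, "title": "Title 1", "store": "store3"}},
    {"_source": {"id": 4, "title": "Title 2", "store": "store2"}},
    {"_source": {"id": 5, "title": "Title 2", "store": "store3"}},
]

grouped = group_hits_by_title(sample_hits)
```

Each element of `grouped` then has a `title` and a `group` list of ids and stores, close to the shape asked for in the question.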

Not exactly what you wanted, but you could also go for aggregations; create one bucket for your title and have a sub-aggregation done on the id field. You won't get the store values with this, but you could retrieve them from your datastore once you have the ids.

{
    "aggs" : {
        "titles" : {
            "terms" : { "field" : "title" },
            "aggs": {
                "ids": {
                    "terms": { "field" : "id" }
                }
            }
        }
    }
}
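
To then look up the stores, you would first need the ids per title out of the aggregation response. A minimal Python sketch, assuming the response follows the usual `aggregations` / `buckets` / `key` layout of terms aggregations and that `title` is indexed as a whole value (not analyzed), so each bucket key is a complete title. `ids_by_title` and `sample_response` are hypothetical, and the sample is hand-written rather than real output.

```python
def ids_by_title(response):
    """Map each title bucket to the list of id keys found beneath it."""
    result = {}
    for title_bucket in response["aggregations"]["titles"]["buckets"]:
        result[title_bucket["key"]] = [
            id_bucket["key"] for id_bucket in title_bucket["ids"]["buckets"]
        ]
    return result


# Hand-written sample in the shape of a nested terms aggregation response.
sample_response = {
    "aggregations": {
        "titles": {
            "buckets": [
                {
                    "key": "Title 1",
                    "doc_count": 3,
                    "ids": {"buckets": [
                        {"key": 1, "doc_count": 1},
                        {"key": 2, "doc_count": 1},
                        {"key": 3, "doc_count": 1},
                    ]},
                },
                {
                    "key": "Title 2",
                    "doc_count": 2,
                    "ids": {"buckets": [
                        {"key": 4, "doc_count": 1},
                        {"key": 5, "doc_count": 1},
                    ]},
                },
            ]
        }
    }
}

title_to_ids = ids_by_title(sample_response)
```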

Edit: It seems that result grouping could soon be implemented with the top_hits aggregation.

knutwalker answered Sep 28 '22 14:09


You can get the desired result by nesting aggregations and using the top_hits aggregation. For example:

{
    "aggs": {
        "set": {
            "terms": {
                "field": "id"
            },
            "aggs": {
                "color": {
                    "terms": {
                        "field": "color"
                    },
                    "aggs": {
                        "products": {
                            "top_hits": {
                                "_source": {
                                    "include": ["size"]
                                }
                            }
                        }
                    }
                },
                "product": {
                    "top_hits": {
                        "_source": {
                            "include": ["productDetails"]
                        },
                        "size": 1
                    }
                }
            }
        }
    }
}
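
Applied to the book data from the question, the same idea might look like the Python sketch below. It assumes an Elasticsearch version that provides the top_hits aggregation, a `title` field that is not analyzed (so the terms aggregation buckets on whole titles), and at most 10 books per title. The names `query_body`, `reshape` and `sample_response` are hypothetical, and the sample response is hand-written in the documented bucket layout, not captured from a real cluster.

```python
# Request body: one bucket per title, each carrying its matching documents.
query_body = {
    "size": 0,  # skip the ordinary hits, only the aggregation is needed
    "aggs": {
        "titles": {
            "terms": {"field": "title"},
            "aggs": {
                "group": {
                    "top_hits": {
                        "_source": ["id", "store"],  # return only these fields
                        "size": 10,  # assumed upper bound on books per title
                    }
                }
            },
        }
    },
}


def reshape(response):
    """Turn the aggregation buckets into a list of {title, group} entries."""
    return [
        {
            "title": bucket["key"],
            "group": [hit["_source"] for hit in bucket["group"]["hits"]["hits"]],
        }
        for bucket in response["aggregations"]["titles"]["buckets"]
    ]


# Hand-written sample response for illustration only.
sample_response = {
    "aggregations": {"titles": {"buckets": [
        {"key": "Title 1", "doc_count": 3, "group": {"hits": {"hits": [
            {"_source": {"id": 1, "store": "store1"}},
            {"_source": {"id": 2, "store": "store2"}},
            {"_source": {"id": 3, "store": "store3"}},
        ]}}},
        {"key": "Title 2", "doc_count": 2, "group": {"hits": {"hits": [
            {"_source": {"id": 4, "store": "store2"}},
            {"_source": {"id": 5, "store": "store3"}},
        ]}}},
    ]}}
}

results = reshape(sample_response)
```

`query_body` would be sent as the search request body with whatever client you use; `reshape` only rearranges the response and does not talk to Elasticsearch itself.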
Prasad Bhosale answered Sep 28 '22 16:09