Is there a way to have elasticsearch return a hit per generated bucket during an aggregation?

Tags:

elasticsearch

Right now I have a query like this:

{
    "query": {
        "bool": {
            "must": [
                {
                    "match": {
                        "uuid": "xxxxxxx-xxxx-xxxx-xxxxx-xxxxxxxxxxxxx"
                    }
                },
                {
                    "range": {
                        "date": {
                            "from": "now-12h",
                            "to": "now"
                        }
                    }
                }
            ]
        }
    },
    "aggs": {
        "query": {
            "terms": [
                {
                    "field": "query",
                    "size": 3
                }
            ]
        }
    }
}

The aggregation works perfectly well, but I can't find a way to control the hit data that is returned. I can use the size parameter at the top of the DSL, but the hits are not returned in the same order as the buckets, so the bucket results do not line up with the hit results. Is there any way to correct this, or do I have to issue two separate queries?

asked Mar 13 '14 05:03

AgentRegEdit

3 Answers

To expand on Filipe's answer, it seems like the top_hits aggregation is what you are looking for, e.g.

{
  "query": {
    ... snip ...
  },
  "aggs": {
    "query": {
      "terms": {
        "field": "query",
        "size": 3
      },
      "aggs": {
        "top": {
          "top_hits": {
            "size": 42
          }
        }
      }
    }
  }
}
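For reference, each bucket in the response then carries its own hits under the name of the sub-aggregation. Below is a minimal Python sketch of reading them, assuming the aggregation names query and top from the request above. The helper name hits_per_bucket and the sample response are made up for illustration, and the response is trimmed to the relevant fields.

```python
def hits_per_bucket(response, terms_agg="query", top_agg="top"):
    """Map each terms bucket key to the _source of its top_hits documents."""
    buckets = response["aggregations"][terms_agg]["buckets"]
    return {
        bucket["key"]: [hit["_source"] for hit in bucket[top_agg]["hits"]["hits"]]
        for bucket in buckets
    }


# Hypothetical, trimmed response; ids and field values are invented.
sample_response = {
    "aggregations": {
        "query": {
            "buckets": [
                {"key": "a", "doc_count": 2, "top": {"hits": {"hits": [
                    {"_id": "1", "_source": {"query": "a"}},
                    {"_id": "2", "_source": {"query": "a"}},
                ]}}},
                {"key": "b", "doc_count": 1, "top": {"hits": {"hits": [
                    {"_id": "3", "_source": {"query": "b"}},
                ]}}},
            ]
        }
    }
}

# Print each bucket key with the number of hits attached to it.
for key, docs in hits_per_bucket(sample_response).items():
    print(key, len(docs))
```

Because the hits live inside their bucket, they line up with the bucket order without any extra sorting on the client side.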

answered Sep 26 '22 05:09

Shadocko

Your query only needs exact matching (the uuid match and the date range) combined with boolean logic (bool, must), so it should probably be converted to use filters instead, with a term filter in place of the match query:

"filtered": {
   "filter": {
      "bool": {
         "must": [
            {
               "term": {
                  "uuid": "xxxxxxx-xxxx-xxxx-xxxxx-xxxxxxxxxxxxx"
               }
            },
            {
               "range": {
                  "date": {
                     "from": "now-12h",
                     "to": "now"
                  }
               }
            }
         ]
      }
   }
}
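
The filtered query shown here is specific to older releases: it was deprecated in Elasticsearch 2.0 and removed in 5.0, where the same non-scoring behaviour comes from the filter clause of a bool query. A rough sketch of the equivalent request body follows, built as a Python dict so it can be serialized and sent with any client. The helper name build_filter_query is made up for illustration, gte/lte stand in for the older from/to range keys, and the term clause assumes uuid is indexed as an exact (not analyzed) value.

```python
import json


def build_filter_query(uuid, since="now-12h", until="now"):
    """Build a non-scoring bool/filter request body (hypothetical helper)."""
    return {
        "query": {
            "bool": {
                "filter": [
                    # Exact match on the uuid field.
                    {"term": {"uuid": uuid}},
                    # Date window; gte/lte replace the legacy from/to keys.
                    {"range": {"date": {"gte": since, "lte": until}}},
                ]
            }
        }
    }


body = build_filter_query("xxxxxxx-xxxx-xxxx-xxxxx-xxxxxxxxxxxxx")
print(json.dumps(body, indent=2))
```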

As for the aggregations,

The hits that are returned do not represent all the buckets that were returned. So if I have buckets for terms 'a', 'b', and 'c', I want to have hits that represent those buckets as well.

Perhaps you are looking to control the scope of the buckets? You can make an aggregation bucket global so that it will not be influenced by the query or filter.
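
As a rough illustration of that scoping, a global aggregation wraps the sub-aggregations that should ignore the query. The sketch below builds such a request body in Python; the helper name with_global_terms and the aggregation names all_docs and by_query are placeholders, not something taken from the question.

```python
def with_global_terms(query, field="query", size=3):
    """Attach a terms aggregation that is computed over all documents."""
    return {
        "query": query,
        "aggs": {
            "all_docs": {
                # An empty "global" body makes this bucket ignore the query.
                "global": {},
                "aggs": {
                    "by_query": {"terms": {"field": field, "size": size}}
                },
            }
        },
    }


body = with_global_terms({"match_all": {}})
```

Note that this changes which documents are counted in the buckets, not which hits are returned.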

Keep in mind that Elasticsearch will not "group" hits in any way -- it is always a flat list ordered according to score and additional sorting options.

Aggregations can be organized in a nested structure and return computed or extracted values in a specific order. For a terms aggregation, the default order is descending document count (the term with the most hits first). The hits section of the response is never influenced by your choice of aggregations. Similarly, you cannot find hits in the aggregation sections.

If your goal is to group documents by a certain field, yes, you will need to run multiple queries in the current Elasticsearch release.
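
The multi-query approach can be sketched as follows: run the terms aggregation first, then issue one follow-up search per bucket key. The Python below only builds the follow-up request bodies. The function name follow_up_queries and its parameters are hypothetical, the sample response is invented, and the term clause assumes the query field holds exact (not analyzed) values so that it matches the bucket key. How the bodies are sent (one by one, or batched through the multi search API) is left open.

```python
def follow_up_queries(agg_response, base_must, agg_name="query",
                      field="query", hits_per_bucket=1):
    """Build one search body per terms bucket, in bucket order."""
    bodies = []
    for bucket in agg_response["aggregations"][agg_name]["buckets"]:
        bodies.append({
            "size": hits_per_bucket,
            "query": {
                "bool": {
                    # Original conditions plus a restriction to this bucket's term.
                    "must": list(base_must) + [{"term": {field: bucket["key"]}}]
                }
            },
        })
    return bodies


# Hypothetical first-pass response, trimmed to the bucket list.
first_pass = {
    "aggregations": {
        "query": {"buckets": [{"key": "a", "doc_count": 5},
                              {"key": "b", "doc_count": 2}]}
    }
}
base = [{"range": {"date": {"from": "now-12h", "to": "now"}}}]
bodies = follow_up_queries(first_pass, base)
```

Since the bodies are built in bucket order, the responses can be paired with the buckets by position.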

answered Sep 24 '22 05:09

BenG

I'm not 100% sure, but I think there's no way to do that in the current version of Elasticsearch (1.2.x). The good news is that there will be when version 1.3.x gets released:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-top-hits-aggregation.html

answered Sep 24 '22 05:09

Filipe
