Filtering out top_hits subaggregation based on total documents

Question

I'm doing map clustering using the Elasticsearch GeoHash grid aggregation. The query returns 100-200 buckets on average. Each bucket uses a top_hits sub-aggregation, which I use to return 3 documents for each aggregated cluster.

The problem is that I want to return top_hits only when the parent aggregation (GeoHash) aggregates no more than 3 documents.

If a cluster aggregates more than 3 documents, I don't want ES to return any documents for this cluster (because I'm not going to use them).

I've tried to use the Bucket Selector Aggregation, but didn't manage to construct a correct buckets_path. I use the bucket selector aggregation on the same level as the top_hits aggregation. The total number of documents for a bucket is available at top_hits.hits.total, but what I'm getting is: reason=path not supported for [top_hits]: [hits.total].

Is this possible in Elasticsearch? It's important for me, because in most queries only a small percentage of buckets will have 3 documents or fewer. But the top_hits sub-aggregation always returns the top 3 documents, even for clusters of 1000 documents. If a query returns 200 buckets and only 5 of them aggregate <= 3 documents, I want to return only 5*3 documents, not 200*3 (the response is 10MB in this case).

Here is the aggs part of my query:

"clusters": {
  "geohash_grid": {
    "field": "coordinates",
    "precision": 3
  },
  "aggs": {
    "top_hits": {
      "top_hits": {
        "size": 3
      }
    },
    "top_hits_filter": {
      "bucket_selector": {
        "buckets_path": {
          "total_hits": "top_hits._count" // tried top_hits.hits.total
        },
        "script": {
          "inline": "total_hits <= 3"
        }
      }
    }
  }
}

Andrei Stefan · Accepted Answer

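Try this, @ilivewithian: point buckets_path at the special _count path, which is the document count of the parent bucket itself (not something under top_hits), and read the variable through params in the script: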

  "aggs": {
    "clusters": {
      "geohash_grid": {
        "field": "coordinates",
        "precision": 3
      },
      "aggs": {
        "top_hits": {
          "top_hits": {
            "size": 3
          }
        },
        "top_hits_filter": {
          "bucket_selector": {
            "buckets_path": {
              "total_hits": "_count"
            },
            "script": {
              "inline": "params.total_hits <= 3"
            }
          }
        }
      }
    }
  }
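To make the effect of the selector easier to see, here is a minimal Python sketch. It only builds the same aggs body as a plain dict and mimics, on the client side, what the bucket_selector is expected to do with `_count`. The helper names `build_cluster_aggs` and `simulate_bucket_selector` and the sample buckets are hypothetical, made up for illustration; no Elasticsearch client is involved. The `inline` script key is kept exactly as in the answer above.

```python
import json

MAX_DOCS = 3  # threshold taken from the question


def build_cluster_aggs(field="coordinates", precision=3, max_docs=MAX_DOCS):
    """Build the aggs body from the answer as a plain dict (hypothetical helper)."""
    return {
        "clusters": {
            "geohash_grid": {"field": field, "precision": precision},
            "aggs": {
                "top_hits": {"top_hits": {"size": max_docs}},
                "top_hits_filter": {
                    "bucket_selector": {
                        # "_count" is the parent bucket's own document count
                        "buckets_path": {"total_hits": "_count"},
                        "script": {
                            "inline": "params.total_hits <= %d" % max_docs
                        },
                    }
                },
            },
        }
    }


def simulate_bucket_selector(buckets, max_docs=MAX_DOCS):
    """Client-side imitation of the selector: keep buckets with doc_count <= max_docs."""
    return [b for b in buckets if b["doc_count"] <= max_docs]


if __name__ == "__main__":
    # Request body that could be sent as the "aggs" part of a search
    print(json.dumps({"aggs": build_cluster_aggs()}, indent=2))

    # Made-up buckets shaped like geohash_grid output (key + doc_count)
    sample = [
        {"key": "u33", "doc_count": 1000},
        {"key": "u2e", "doc_count": 2},
        {"key": "gcp", "doc_count": 3},
    ]
    print(simulate_bucket_selector(sample))
```

One thing worth checking against your use case: a bucket that fails the selector script is dropped from the response entirely, so clusters with more than 3 documents would not come back at all, counts included. If you still need the counts of the large clusters on the map, this filter alone may not be enough.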

Tags: elasticsearch, elasticsearch-5, elasticsearch-aggregation

mbudnik
