
ElasticSearch DateHistogram Aggregation Fill Missing Data

I'm trying to use Spring Data Elasticsearch for some aggregations.

Here is my query:

final FilteredQueryBuilder filteredQuery = QueryBuilders.filteredQuery(
    QueryBuilders.matchAllQuery(),
    FilterBuilders.andFilter(
        FilterBuilders.termFilter("gender", "F"),
        FilterBuilders.termFilter("place", "Arizona"),
        FilterBuilders.rangeFilter("dob").from(from).to(to)));

final MetricsAggregationBuilder<?> aggregateArtifactcount =
    AggregationBuilders.sum("delivery").field("birth");

final AggregationBuilder<?> dailyDateHistogarm =
    AggregationBuilders.dateHistogram(AggregationConstants.DAILY)
        .field("dob")
        .interval(DateHistogram.Interval.DAY)
        .subAggregation(aggregateArtifactcount);

final SearchQuery query = new NativeSearchQueryBuilder()
    .withIndices(index)
    .withTypes(type)
    .withQuery(filteredQuery)
    .addAggregation(dailyDateHistogarm)
    .build();

return elasticsearchTemplate.query(query, new DailyDeliveryAggregation());

And this is the extractor I use to read the aggregation results:

public class DailyDeliveryAggregation implements ResultsExtractor<List<DailyDeliverySum>> {

    @SuppressWarnings("unchecked")
    @Override
    public List<DailyDeliverySum> extract(final SearchResponse response) {
        final List<DailyDeliverySum> dailyDeliverySum = new ArrayList<DailyDeliverySum>();
        final Aggregations aggregations = response.getAggregations();
        final DateHistogram daily = aggregations.get(AggregationConstants.DAILY);
        final List<DateHistogram.Bucket> buckets = (List<DateHistogram.Bucket>) daily.getBuckets();
        for (final DateHistogram.Bucket bucket : buckets) {
            final Sum sum = (Sum) bucket.getAggregations().getAsMap().get("delivery");
            final int deliverySum = (int) sum.getValue();
            final int delivery = (int) bucket.getDocCount();
            final String dateString = bucket.getKeyAsText().string();
            dailyDeliverySum.add(new DailyDeliverySum(deliverySum, delivery, dateString));
        }
        return dailyDeliverySum;
    }
}

It gives me the correct data, but it doesn't satisfy all my needs. Suppose I query a time range of 10 days: if there is no data for a date in that range, that date is missing from the date histogram buckets. Instead, I want 0 as the default value for both the aggregation and the doc count when no data is available.

Is there any way to do this?

edwin asked Jun 28 '26 22:06


1 Answer

Yes, you can use the "minimum document count" feature of the date_histogram aggregation and set it to 0. That way, you'll also get buckets that don't contain any data:

final AggregationBuilder<?> dailyDateHistogarm =
    AggregationBuilders.dateHistogram(AggregationConstants.DAILY)
        .field("dob")
        .minDocCount(0)                          // <--- add this line
        .interval(DateHistogram.Interval.DAY)
        .subAggregation(aggregateArtifactcount);
Val answered Jun 30 '26 16:06
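
One caveat worth keeping in mind: a minimum document count of 0 only produces empty buckets between the first and last buckets that actually contain documents. If the first or last days of the requested range have no data, those days are still absent unless the histogram's extended_bounds option is also set to the range. As an alternative, the gaps can be filled on the client after the results are extracted. The following is a minimal sketch of that idea, not part of the original answer. The DailyGapFiller class, the DailyPoint type (a stand-in for the question's DailyDeliverySum) and the sample dates are hypothetical, and it assumes each bucket key has already been reduced to a yyyy-MM-dd string.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: fills missing days with zero-valued entries on the client side.
class DailyGapFiller {

    // Hypothetical stand-in for the question's DailyDeliverySum.
    public static final class DailyPoint {
        public final String date;      // assumed format: yyyy-MM-dd
        public final int deliverySum;  // value of the "delivery" sum aggregation
        public final int docCount;     // document count of the bucket

        public DailyPoint(String date, int deliverySum, int docCount) {
            this.date = date;
            this.deliverySum = deliverySum;
            this.docCount = docCount;
        }
    }

    // Returns one entry per day in [from, to] (both inclusive), in date order.
    // Days with no bucket get deliverySum = 0 and docCount = 0.
    public static List<DailyPoint> fillMissingDays(List<DailyPoint> buckets,
                                                   LocalDate from, LocalDate to) {
        // Index the existing buckets by day for quick lookup.
        Map<LocalDate, DailyPoint> byDay = new HashMap<>();
        for (DailyPoint point : buckets) {
            byDay.put(LocalDate.parse(point.date), point);
        }

        List<DailyPoint> filled = new ArrayList<>();
        for (LocalDate day = from; !day.isAfter(to); day = day.plusDays(1)) {
            DailyPoint existing = byDay.get(day);
            filled.add(existing != null ? existing : new DailyPoint(day.toString(), 0, 0));
        }
        return filled;
    }

    public static void main(String[] args) {
        // Sample data (made up): only one day in a four-day range has documents.
        List<DailyPoint> sparse = new ArrayList<>();
        sparse.add(new DailyPoint("2015-06-02", 5, 2));

        List<DailyPoint> filled = fillMissingDays(
            sparse, LocalDate.parse("2015-06-01"), LocalDate.parse("2015-06-04"));

        for (DailyPoint point : filled) {
            System.out.println(point.date + " sum=" + point.deliverySum
                + " docs=" + point.docCount);
        }
    }
}
```

With this approach the extractor from the question could pass its list through fillMissingDays using the same from and to values as the range filter, so the result always has one entry per requested day regardless of how the aggregation is configured.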