I have a use case where my data looks like this:
{
  name: "John",
  parentid: "1234",
  filter: {a: '1', b: '3', c: '4'}
},
{
  name: "Tim",
  parentid: "2222",
  filter: {a: '2', b: '1', c: '4'}
},
{
  name: "Mary",
  parentid: "1234",
  filter: {a: '1', b: '3', c: '5'}
},
{
  name: "Tom",
  parentid: "2222",
  filter: {a: '1', b: '3', c: '1'}
}
Expected results:
bucket: [{
  key: "2222",
  hits: [{
    name: "Tom" ...
  },
  {
    name: "Tim" ...
  }]
},
{
  key: "1234",
  hits: [{
    name: "John" ...
  },
  {
    name: "Mary" ...
  }]
}]
I want to return unique documents grouped by parentid. I could use a top hits aggregation, but I don't know how to paginate the buckets. parentid values are more likely to differ than to match, so my bucket array would be large, and I want to show all of the buckets, but paginated.
How can you get distinct values of a field in Elasticsearch? You can use the "terms" aggregation. It returns the unique values of the field as buckets, ordered by document count (most frequent first) by default, and limited to the top 10 buckets unless size is set.
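As a rough illustration of what a terms aggregation computes, the following Python sketch counts the distinct parentid values in memory, using the sample documents from the question. It does not talk to Elasticsearch. The function name distinct_values is hypothetical, and breaking ties by key is an assumption made here only to keep the output deterministic.

```python
from collections import Counter

# Sample documents from the question (the filter field is omitted for brevity).
docs = [
    {"name": "John", "parentid": "1234"},
    {"name": "Tim", "parentid": "2222"},
    {"name": "Mary", "parentid": "1234"},
    {"name": "Tom", "parentid": "2222"},
]


def distinct_values(documents, field):
    """Return (value, doc_count) pairs, highest count first.

    In-memory stand-in for a terms aggregation; ties are broken by key,
    which is an assumption made for this sketch.
    """
    counts = Counter(doc[field] for doc in documents)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


buckets = distinct_values(docs, "parentid")
print(buckets)
```

Each pair plays the role of one bucket: the distinct value is the key and the count is the number of documents that carry it.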
Set your aggregation back to Count and add a Split Rows bucket. Otherwise the table will show a count of 1 for each field value, since it is counting unique values. The noteworthy part is setting the Top field to 0, because Kibana will not let you enter anything other than a digit there.
One solution is to use the uniqueId field value as the document ID and to set op_type=create when indexing documents into ES. This ensures that each uniqueId value is stored only once and is not overwritten by a later document with the same value.
You can use Visual Builder to show the number of duplicates per bucket, so the metric shows the number of duplicates in the latest time interval. If you want to show the total number of duplicates, the accurate way is to make the bucket interval large enough that it contains essentially all of the data.
There is no direct way of doing this, but you can follow these steps to get the desired result.
Step 1. Get the list of all parentid values. You can obtain it with a simple terms aggregation on the parentid field. It returns only the list of parentid values, not the documents matching them, so the resulting array is smaller than the one you are currently expecting.
{
  "aggs": {
    "parentids": {
      "terms": {
        "field": "parentid",
        "size": 0
      }
    }
  }
}
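size: 0 is required to return all results. Note that this applies to older Elasticsearch versions: size: 0 is no longer valid for the terms aggregation as of Elasticsearch 5.0, where an explicit size has to be specified instead.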
Alternatively, if you already know the list of all parentid values, you can move directly to Step 2.
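Once the list of parentid keys is available, whether from Step 1 or known beforehand, the list itself can be paged on the client side. A minimal sketch, assuming the keys have already been extracted into a Python list; the helper page_of_keys and its parameters are hypothetical and not part of any Elasticsearch client:

```python
def page_of_keys(keys, page, per_page):
    """Return the slice of bucket keys for a zero-based page number."""
    if page < 0 or per_page <= 0:
        raise ValueError("page must be >= 0 and per_page must be > 0")
    start = page * per_page
    return keys[start:start + per_page]


# Keys as they might be collected from the Step 1 aggregation (sample data).
parentids = ["1234", "2222"]

first_page = page_of_keys(parentids, page=0, per_page=1)
second_page = page_of_keys(parentids, page=1, per_page=1)
```

Only the keys on the current page then need to be looked up in Step 2, which keeps each request small even when the number of distinct parentid values is large.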
Step 2. Fetch the related documents by filtering on parentid. This is where you can apply pagination.
{
  "from": 0,
  "size": 20,
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "term": {
          "parentid": "2222"
        }
      }
    }
  }
}
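from and size are used for pagination, so you can loop through each parentid in the list and fetch all of its related documents. Note that the filtered query used here was removed in Elasticsearch 5.0. On newer versions the same filter can be expressed with a bool query and a filter clause.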
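The loop described here can be sketched in Python as follows. The sketch builds the Step 2 request body for each parentid and pages through the results with from and size. The fetch argument is a hypothetical callable standing in for whatever client call sends the body to Elasticsearch. It is replaced below by an in-memory fake (fake_fetch, also hypothetical) so that the example runs without a cluster.

```python
def build_step2_body(parentid, from_, size):
    """Build the Step 2 request body for one parentid.

    Uses the filtered query exactly as written in the answer; newer
    Elasticsearch versions would need a different query shape.
    """
    return {
        "from": from_,
        "size": size,
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"term": {"parentid": parentid}},
            }
        },
    }


def fetch_all_for_parent(fetch, parentid, page_size=20):
    """Page through every document of one parentid using from/size."""
    hits, from_ = [], 0
    while True:
        page = fetch(build_step2_body(parentid, from_, page_size))
        hits.extend(page)
        if len(page) < page_size:  # a short page means nothing is left
            return hits
        from_ += page_size


# Sample documents from the question (the filter field is omitted).
DOCS = [
    {"name": "John", "parentid": "1234"},
    {"name": "Tim", "parentid": "2222"},
    {"name": "Mary", "parentid": "1234"},
    {"name": "Tom", "parentid": "2222"},
]


def fake_fetch(body):
    """In-memory stand-in for the search call (illustration only)."""
    parentid = body["query"]["filtered"]["filter"]["term"]["parentid"]
    matching = [d for d in DOCS if d["parentid"] == parentid]
    return matching[body["from"]:body["from"] + body["size"]]


# One bucket per parentid, in the shape of the expected results.
result = [
    {"key": pid, "hits": fetch_all_for_parent(fake_fetch, pid, page_size=1)}
    for pid in ["2222", "1234"]
]
```

The order of the hits inside each bucket follows whatever order the search returns. The sketch makes no attempt to reproduce a particular sort.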