I've got the following simple mapping:
"element": {
"dynamic": "false",
"properties": {
"id": { "type": "string", "index": "not_analyzed" },
"group": { "type": "string", "index": "not_analyzed" },
"type": { "type": "string", "index": "not_analyzed" }
}
}
This is basically a way to store a Group object:
{
id : "...",
elements : [
{id: "...", type: "..."},
...
{id: "...", type: "..."}
]
}
I want to find how many different groups share the same sequence of element types (ordered, including repetitions).
An obvious solution would be to change the schema to:
"element": {
"dynamic": "false",
"properties": {
"group": { "type": "string", "index": "not_analyzed" },
"concatenated_list_of_types": { "type": "string", "index": "not_analyzed" }
}
}
However, our requirements say we must be able to exclude some types from the group-by (aggregation) at query time, so a precomputed concatenated value won't work :(
All fields of the document are mongo ids, so in SQL I would do something like this:
SELECT COUNT(group_id), concat_value FROM (
SELECT GROUP_CONCAT(type_id) AS concat_value, group_id
FROM table
WHERE type_id != 'some_filtered_out_type_id'
GROUP BY group_id
) T GROUP BY concat_value
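To make the intended logic concrete, here is a minimal Python sketch of the same two-step grouping: concatenate the types per group (skipping excluded types), then count how many groups produce each concatenated value. The function name, the `-` separator, and the sample rows are made up for illustration; they are not part of the real data.

```python
from collections import Counter, defaultdict

def count_groups_by_type_sequence(rows, excluded_types=frozenset(), sep="-"):
    """rows: iterable of (group_id, type_id) pairs, in element order."""
    types_per_group = defaultdict(list)
    for group_id, type_id in rows:
        if type_id in excluded_types:
            continue  # same role as the WHERE clause in the SQL
        types_per_group[group_id].append(type_id)
    # Inner GROUP BY: one concatenated string per group.
    # Outer GROUP BY: count groups per concatenated string.
    return Counter(sep.join(types) for types in types_per_group.values())

# Hypothetical sample data: three groups, type "x" is filtered out.
rows = [
    ("g1", "a"), ("g1", "b"),
    ("g2", "a"), ("g2", "x"), ("g2", "b"),
    ("g3", "b"), ("g3", "a"),
]
counts = count_groups_by_type_sequence(rows, excluded_types={"x"})
print(dict(counts))
```

As with the SQL, a group whose elements are all excluded simply drops out of the result.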
In Elasticsearch, with the given mapping, it's really easy to filter types out, and counting is also not a problem, assuming we have a concatenated value. The missing piece is building that concatenated value: the sum aggregation does not work on strings.
How can I get this working? :)
Thanks!
I finally solved this problem with scripting and by changing the mapping:
{
"mappings": {
"group": {
"dynamic": "false",
"properties": {
"id": { "type": "string", "index": "not_analyzed" },
"elements": { "type": "string", "index": "not_analyzed" }
}
}
}
}
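There are still some issues with duplicate elements in the array: ScriptDocValues.Strings strips out duplicates (probably because doc values are stored as a sorted set of unique terms, which would also affect the ordering). Still, here's an aggregation that counts by string concatenation: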
{
"aggs": {
"path": {
"scripted_metric": {
"map_script": "key = doc['elements'].join('-'); _agg[key] = _agg[key] ? _agg[key] + 1 : 1",
"combine_script": "_agg",
"reduce_script": "_aggs.collectMany { it.entrySet() }.inject( [:] ) { result, e -> result << [ (e.key):e.value + ( result[ e.key ] ?: 0 ) ]}"
}
}
}
}
The result would be as follows:
"aggregations" : {
"path" : {
"value" : {
"5639abfb5cba47087e8b457e" : 362,
"568bfc495cba47fc308b4567" : 3695,
"5666d9d65cba47701c413c53" : 14,
"5639abfb5cba47087e8b4571-5639abfb5cba47087e8b457b" : 1,
"570eb97abe529e83498b473d" : 1
}
}
}
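For readers unfamiliar with scripted_metric, the following Python sketch imitates what the scripts above do: the map step builds a per-shard count keyed by the joined elements, and the reduce step merges the per-shard maps. It is only a simulation under assumptions, not Elasticsearch code. In particular, `doc_values` models the duplicate-stripping behaviour mentioned above as "sorted unique terms", which is an assumption, and the shard contents are invented.

```python
def doc_values(values):
    # Assumption: the script sees the field as sorted, de-duplicated terms.
    return sorted(set(values))

def map_phase(shard_docs, sep="-"):
    """Mimics map_script: count documents per joined 'elements' key."""
    agg = {}
    for elements in shard_docs:
        key = sep.join(doc_values(elements))
        agg[key] = agg.get(key, 0) + 1
    return agg  # combine_script just returns this per-shard map

def reduce_phase(shard_aggs):
    """Mimics reduce_script: sum the counts from all shards per key."""
    result = {}
    for agg in shard_aggs:
        for key, count in agg.items():
            result[key] = result.get(key, 0) + count
    return result

# Hypothetical documents spread over two shards (each list is 'elements').
shards = [
    [["t1"], ["t1", "t2"]],
    [["t1"], ["t2", "t1", "t1"]],
]
shard_aggs = [map_phase(docs) for docs in shards]
totals = reduce_phase(shard_aggs)
print(totals)
```

Note how `["t2", "t1", "t1"]` and `["t1", "t2"]` end up under the same key in this model, which is exactly the caveat about duplicates (and ordering) described above.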