Confusion about the Elasticsearch JSON DSL query structure

In many places in the Elasticsearch query DSL documentation, the wrapping JSON is left out of the examples, probably to keep the documentation short. But it's been confusing as I navigate the documentation. What are the official rules for what can or should go where in a JSON query? In other words, I'm trying to find the standard or pattern common to all Elasticsearch queries, because I need to build an internal API to query Elasticsearch. Is there a template that contains all of the grammar components (a "query": {} inside a "bool": {}, a filter, etc.) in which I can just fill in the relevant parts and it still runs?

Horse Voice asked Aug 03 '15 15:08



1 Answer

I also find Elastic's DSL structure confusing, but after running hundreds of queries you get used to it.

Here are a few (full) examples of different types of queries. Hopefully they will clear up some of your questions; feel free to add scenarios in a comment and I'll add more examples.

This is what a standard query looks like:

{
    "query": {
        "bool": {
            "must": {
                "match": {
                    "message": "abcd"
                }
            }
        }
    }
}
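
As a rough sketch of how an internal API could generate this shape, the Python helpers below build the same request as nested dicts. The helper names (`search_body`, `bool_must`, `match`) are made up for illustration and are not part of any Elasticsearch client. Note that `must` is emitted as a list here, which Elasticsearch accepts as well as a single object.

```python
import json


def match(field, text):
    """A match query: full-text match of `text` against `field`."""
    return {"match": {field: text}}


def bool_must(*clauses):
    """A bool query whose "must" list holds every given clause."""
    return {"bool": {"must": list(clauses)}}


def search_body(clause):
    """Wrap a single query clause in the top-level "query" envelope."""
    return {"query": clause}


body = search_body(bool_must(match("message", "abcd")))
print(json.dumps(body, indent=4))
```

The point of the design is that every helper returns one complete clause, so clauses can be nested without the caller tracking braces.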

A filtered query looks different, however; notice the change in structure when filtering in Elasticsearch:

{
    "query": {
        "filtered": {
            "filter": {
                "term": {
                    "message": "abcd"
                }
            }
        }
    }
}
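
One caveat worth keeping in mind: the `filtered` wrapper is Elasticsearch 1.x syntax. It was deprecated in 2.0 and removed in 5.0, where the same filter goes into the `filter` clause of a `bool` query. A minimal sketch of both shapes, with hypothetical helper names:

```python
import json


def term(field, value):
    """A term filter: exact match on a single field."""
    return {"term": {field: value}}


def filtered_body(filter_clause):
    """Elasticsearch 1.x shape: the filter sits inside a "filtered" query."""
    return {"query": {"filtered": {"filter": filter_clause}}}


def bool_filter_body(filter_clause):
    """Elasticsearch 2.0+ shape: the filter sits in a bool query's "filter" clause."""
    return {"query": {"bool": {"filter": filter_clause}}}


legacy = filtered_body(term("message", "abcd"))
modern = bool_filter_body(term("message", "abcd"))
print(json.dumps(legacy, indent=4))
print(json.dumps(modern, indent=4))
```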

(Read more about the difference between Filters and Queries)

Here's what a request that has both a filter and a query looks like:

{
    "query": {
        "filtered": {
            "filter": {
                "term": {
                    "message": "abcd"
                }
            },
            "query": {
                "bool": {
                    "must": {
                        "match": {
                            "message2": "bbbb"
                        }
                    }
                }
            }
        }
    }
}
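
For an API that has to emit either form, one possible approach is a single builder in which both parts are optional. This is only a sketch: the function name `filtered` and its parameters are assumptions for illustration, not an existing client call.

```python
import json


def filtered(query=None, filter_clause=None):
    """Build a 1.x "filtered" request from an optional query and an optional filter."""
    inner = {}
    if filter_clause is not None:
        inner["filter"] = filter_clause
    if query is not None:
        inner["query"] = query
    if not inner:
        raise ValueError("need at least a query or a filter")
    return {"query": {"filtered": inner}}


body = filtered(
    query={"bool": {"must": {"match": {"message2": "bbbb"}}}},
    filter_clause={"term": {"message": "abcd"}},
)
print(json.dumps(body, indent=4))
```

Calling it with only `filter_clause` yields the filter-only shape from the previous example.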

Here's how you run a filter with multiple conditions:

{
    "query": {
        "filtered": {
            "filter": {
                "and": [
                    {
                        "term": {
                            "message": "abcd"
                        }
                    },
                    {
                        "term": {
                            "message2": "abcdd"
                        }
                    }
                ]
            }
        }
    }
}

And a more complex filter:

{
    "query": {
        "filtered": {
            "filter": {
                "and": [
                    {
                        "term": {
                            "message": "abcd"
                        }
                    },
                    {
                        "term": {
                            "message2": "abcdd"
                        }
                    },
                    {
                        "or": [
                            {
                                "term": {
                                    "message3": "abcddx"
                                }
                            },
                            {
                                "term": {
                                    "message4": "abcdd2"
                                }
                            }
                        ]
                    }
                ]
            }
        }
    }
}
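
Nesting like this is where hand-written JSON gets error-prone, so here is a small sketch of composing the same filter from functions. The names `term`, `and_` and `or_` are hypothetical helpers that simply mirror the `term`, `and` and `or` keys used above.

```python
import json


def term(field, value):
    """A term filter: exact match on a single field."""
    return {"term": {field: value}}


def and_(*filters):
    """An "and" filter: every nested filter must match."""
    return {"and": list(filters)}


def or_(*filters):
    """An "or" filter: at least one nested filter must match."""
    return {"or": list(filters)}


complex_filter = and_(
    term("message", "abcd"),
    term("message2", "abcdd"),
    or_(term("message3", "abcddx"), term("message4", "abcdd2")),
)
body = {"query": {"filtered": {"filter": complex_filter}}}
print(json.dumps(body, indent=4))
```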

Simple query with aggregations:

{
    "query": {
        "filtered": {
            "filter": {
                "term": {
                    "message": "abcd"
                }
            }
        }
    },
    "aggs": {
        "any_name_will_work_here": {
            "max": {
                "field": "metric1"
            }
        }
    }
}
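
Since `aggs` is a sibling of `query` rather than a child, a builder can attach it after the query part is done. The sketch below assumes two illustrative helpers, `max_agg` and `with_aggs`, neither of which comes from an Elasticsearch library.

```python
import json


def max_agg(field):
    """A max metric aggregation over `field`."""
    return {"max": {"field": field}}


def with_aggs(body, **named_aggs):
    """Return a copy of `body` with the named aggregations added at the top level."""
    result = dict(body)
    result["aggs"] = {**body.get("aggs", {}), **named_aggs}
    return result


query_only = {"query": {"filtered": {"filter": {"term": {"message": "abcd"}}}}}
body = with_aggs(query_only, any_name_will_work_here=max_agg("metric1"))
print(json.dumps(body, indent=4))
```

The keyword argument name becomes the aggregation's name in the response, which is why any name works there.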

A query_string query:

{
    "query": {
        "query_string": {
            "default_field": "message",
            "query": "this AND that"
        }
    }
}

Some other things to consider when using the DSL:

  1. You can add a size parameter at the top level (as a sibling of query) to control how many results are returned. If you want JUST doc counts, use "size": 0, which returns no hits, only the metadata.
  2. However, when using aggs the size parameter has a twist: setting "size": 0 inside a bucket aggregation (a terms aggregation, for example) tells ES to return ALL aggregation buckets.
  3. The DSL structure has exceptions. My examples mostly use term, but range, for example, has a slightly different structure.
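
To make points 1 and 3 concrete, here is a hedged sketch. `with_size` and `range_filter` are hypothetical helper names; the `range` shape shown (field name first, then `gte`/`lte` bounds nested under it) is the part that differs from `term`.

```python
import json


def with_size(body, size):
    """Return a copy of `body` with a top-level "size" (0 means no hits, metadata only)."""
    if size < 0:
        raise ValueError("size must be non-negative")
    result = dict(body)
    result["size"] = size
    return result


def range_filter(field, gte=None, lte=None):
    """A range filter: the bounds are nested under the field name."""
    bounds = {}
    if gte is not None:
        bounds["gte"] = gte
    if lte is not None:
        bounds["lte"] = lte
    return {"range": {field: bounds}}


base = {"query": {"filtered": {"filter": range_filter("metric1", gte=10, lte=20)}}}
count_only = with_size(base, 0)
print(json.dumps(count_only, indent=4))
```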
Or Weinberger answered Nov 08 '22 18:11
