
Elasticsearch: using filter and should in a bool query

I want to use Elasticsearch to search through a large address database. Like some other applications, I start with the postcode, which is a great way to narrow down the rest of the search query.

So with Search::Elasticsearch I do:

my $scroll = $e->scroll_helper(index => 'pdb', search_type => 'scan', size => 100,
    body => {
        query => {
            bool => { 
                filter => [
                    {match => { pcode => $postcode }},
                ],
                should => [
                    {match => { address => $keyword }},
                    {match => { name => $keyword }},
                ],
            }
        }
    }
);

However, that just returns everything for $postcode; regardless of what $keyword is, the result set is not reduced any further.

I need $postcode to be a mandatory condition, and in addition I need the other two fields to be taken into account as a full-text search. How should I do this? (I'm looking at the docs and might be translating the JSON into Perl hashrefs wrongly, so any suggestions are welcome.)

As a hypothetical example: the user enters NW1 4AQ, and the above query immediately returns, say, Albany Street and Portland Street. If the user then searches for Portland with that postcode, I expect only Portland Street in the results instead of both. Right now the above query keeps returning both entries.

Recct asked Jun 14 '16 16:06



2 Answers

Following common sense I found that the following does what I want for the bool segment:

bool => {
    must => [
        {match => { pcode => $postcode }},
    ],
    should => [
        {match => { address => $keyword }},
        {match => { name => $keyword }},
    ],
    minimum_should_match => 1,
}

Setting minimum_should_match to 1 (it is a count rather than a true/false flag) effectively puts an OR between the should clauses: at least one of them has to match.
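For anyone unsure about the JSON-to-Perl mapping mentioned in the question, here is a minimal sketch that builds the query body and prints it as JSON so it can be compared with the examples in the Elasticsearch docs. It keeps the postcode in filter rather than must, on the assumption that the postcode does not need to influence scoring. The sample values for $postcode and $keyword are hypothetical; the field names are the ones from the question.

```perl
use strict;
use warnings;
use JSON::PP;    # core module, used here only to inspect the structure

# Hypothetical sample input; in the real application these come from the user.
my ($postcode, $keyword) = ('NW1 4AQ', 'Portland');

my $body = {
    query => {
        bool => {
            # Mandatory condition; filter clauses do not contribute to the score.
            filter => [
                { match => { pcode => $postcode } },
            ],
            # Optional by default once a must or filter clause is present.
            should => [
                { match => { address => $keyword } },
                { match => { name    => $keyword } },
            ],
            # Require at least one of the should clauses to match.
            minimum_should_match => 1,
        },
    },
};

# Print the JSON that corresponds to the hashref above.
print JSON::PP->new->canonical->pretty->encode($body);
```

The same $body hashref can be passed as the body argument of the search or scroll_helper call shown in the question.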

Recct answered Oct 16 '22 12:10


The Elasticsearch documentation says:

"By default, none of the should clauses are required to match, with one exception: if there are no must clauses, then at least one should clause must match. Just as we can control the precision of the match query, we can control how many should clauses need to match by using the minimum_should_match parameter, either as an absolute number or as a percentage"

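So the way to do it is through minimum_should_match, just as you did. With a value of 1, at least one of address or name must match. In your original query the filter clause had the same effect as a must clause in this respect, so the should clauses were optional and could not reduce the result set.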
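The rule quoted above also suggests an alternative that does not need minimum_should_match at all: wrap the should clauses in their own bool query and make that nested query mandatory. The nested bool has no must or filter clauses, so at least one of its should clauses has to match. This is a sketch of that variant rather than what either answer used; the sample values are hypothetical and the field names are taken from the question.

```perl
use strict;
use warnings;
use JSON::PP;    # core module, used here only to inspect the structure

# Hypothetical sample input.
my ($postcode, $keyword) = ('NW1 4AQ', 'Portland');

my $nested_body = {
    query => {
        bool => {
            # Mandatory postcode condition.
            filter => [
                { match => { pcode => $postcode } },
            ],
            # The nested bool as a whole is mandatory.
            must => [
                {
                    bool => {
                        # Only should clauses here, so at least one must match.
                        should => [
                            { match => { address => $keyword } },
                            { match => { name    => $keyword } },
                        ],
                    },
                },
            ],
        },
    },
};

print JSON::PP->new->canonical->pretty->encode($nested_body);
```

Both forms express "postcode AND (address OR name)"; the nested form simply makes the grouping explicit in the structure of the query.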

israelst answered Oct 16 '22 12:10