How can I merge rankings from several Elasticsearch queries?

Question

I would like to merge the rankings obtained from querying separate fields of an Elasticsearch index, so as to obtain a "compound" ranking.

As a (silly) "matchmaking" example, suppose I wanted to retrieve the best-matching results from an index of people containing their favorite music, foods, and sports.

The separate queries could be e.g.

"query": { "match" : { "music" : "indie classical metal" } }

which would yield me as ranked results:

1. Alice, 2. Bob, 3. Charlie;

"query": { "match" : { "foods" : "falafel strawberries coffee" } }

yielding

1. Alice, 2. Charlie, 3. Bob;

and

"query": { "match" : { "sports" : "basketball ski" } }

yielding

1. Charlie, 2. Alice, 3. Bob.

Now, I would like to obtain an "aggregate" ranking based on the rankings above, e.g. using the voting methods listed in "How to merge a collection of ordered preferences".
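As an illustration of one such voting method, here is a minimal client-side sketch of a Borda count applied to the three example rankings above. The function name `borda_merge` and the alphabetical tie-break are assumptions made for this sketch; in practice each list would be built from the ordered hits of one search response.

```python
from collections import defaultdict

def borda_merge(rankings):
    """Merge ranked lists with a Borda count: in a list of n names,
    the name at 0-based position p earns n - 1 - p points."""
    points = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, name in enumerate(ranking):
            points[name] += n - 1 - position
    # Highest total first; ties broken alphabetically (an arbitrary choice).
    return sorted(points.items(), key=lambda item: (-item[1], item[0]))

# The three per-field rankings from the example above.
rankings = [
    ["Alice", "Bob", "Charlie"],   # music
    ["Alice", "Charlie", "Bob"],   # foods
    ["Charlie", "Alice", "Bob"],   # sports
]

merged = borda_merge(rankings)
print(merged)  # [('Alice', 5), ('Charlie', 3), ('Bob', 1)]
```

Only positions enter the computation, so the magnitude of the relevance scores behind each ranking has no influence on the merged order.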

So far, to achieve something along these lines I used syntax for compound queries such as

"query": {
    "bool": {
        "should": [
            { "match" : { "music" : "indie classical metal" } },
            { "match" : { "foods" : "falafel strawberries coffee" } },
            { "match" : { "sports" : "basketball ski" } }
        ]
    }
}

or

"query": {
    "dis_max": {
        "queries": [
            { "match" : { "music" : "indie classical metal" } },
            { "match" : { "foods" : "falafel strawberries coffee" } },
            { "match" : { "sports" : "basketball ski" } }
        ]
    }
}

but (AFAIK) these don't do what I am looking for, because they combine scores rather than ranks. I understand that it's fairly straightforward to post-process the rankings (e.g. using elasticsearch-py and a few lines of Python), but is it possible to do this directly with an Elasticsearch query?
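To make the score-versus-rank distinction concrete, here is a small Python sketch. The per-field scores are invented for illustration (they are not Elasticsearch output), chosen so that the per-field orderings match the three example rankings in this question. It models bool/should as summing the scores of the matching clauses and dis_max (with its default tie_breaker of 0) as keeping only the best clause score.

```python
# Hypothetical per-field relevance scores (invented numbers, not real output).
# Per field, they reproduce the example rankings:
#   music:  Alice > Bob > Charlie
#   foods:  Alice > Charlie > Bob
#   sports: Charlie > Alice > Bob
field_scores = {
    "Alice":   {"music": 2.0, "foods": 2.0, "sports": 1.0},
    "Bob":     {"music": 1.5, "foods": 1.0, "sports": 0.5},
    "Charlie": {"music": 1.0, "foods": 1.5, "sports": 9.0},
}

def order_by(combine):
    """Order people by a combined score, highest first."""
    combined = {name: combine(scores.values())
                for name, scores in field_scores.items()}
    return sorted(combined, key=lambda name: -combined[name])

bool_should_order = order_by(sum)  # sum of clause scores
dis_max_order = order_by(max)      # best single clause score

print(bool_should_order)  # ['Charlie', 'Alice', 'Bob']
print(dis_max_order)      # ['Charlie', 'Alice', 'Bob']
```

With these made-up numbers, a single very high sports score puts Charlie first under both combinations, even though Alice is ranked first in two of the three fields and second in the other. That is the kind of effect a purely rank-based merge would avoid.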

(Bonus question: could you suggest alternative strategies for merging rankings from multiple fields, beyond bool+should and dis_max, that I could try out?)

G0l0s · Accepted Answer

Answer #2. Pure Scripting

(See the document score model and the first strategy in Answer #1)

The second strategy is pure scripting: the score of each document is computed entirely by a script.

Mapping

PUT /ranking_people_scripted
{
    "mappings": {
        "properties": {
            "name": {
                "type": "keyword"
            },
            "music": {
                "type": "keyword"
            },
            "foods": {
                "type": "keyword"
            },
            "sports": {
                "type": "keyword"
            }
        }
    }
}

Documents (see Answer #1)

Ranking scripted query

GET /ranking_people_scripted/_search?filter_path=hits.hits
{
    "query": {
        "script_score": {
            "query": {
                "match_all": {}
            },
            "script": {
                "source": """
                    int calculateFieldScore(List fieldTerms, List queryTerms) {
                        def fieldScore = 0;
                        for (def queryTerm : queryTerms) {
                            if (fieldTerms.contains(queryTerm)) {
                                fieldScore++;
                            }
                        }
                        return fieldScore;
                    }
                    
                    def documentScore = 0;
                    def termSets = params.term_sets;
                    
                    for (def termSet : termSets) {
                        def queryTerms = termSet.terms;
                        def field = termSet.field;
                        def fieldBoost = termSet.boost;
                        def fieldTerms = doc[field];
                        
                        int fieldScore = calculateFieldScore(fieldTerms, queryTerms);
                        
                        documentScore += fieldScore * fieldBoost;
                    }
                    return documentScore;
                """,
                "params": {
                    "term_sets": [
                        {
                            "terms": [
                                "indie",
                                "classical"
                            ],
                            "field": "music",
                            "boost": 1
                        },
                        {
                            "terms": [
                                "strawberries",
                                "coffee"
                            ],
                            "field": "foods",
                            "boost": 1
                        },
                        {
                            "terms": [
                                "hockey",
                                "basketball"
                            ],
                            "field": "sports",
                            "boost": 1
                        }
                    ]
                }
            }
        }
    },
    "fields": [
        "name"
    ],
    "_source": false
}

Response

{
    "hits" : {
        "hits" : [
            {
                "_index" : "ranking_people_scripted",
                "_type" : "_doc",
                "_id" : "1",
                "_score" : 5.0,
                "fields" : {
                    "name" : [
                        "Alice"
                    ]
                }
            },
            {
                "_index" : "ranking_people_scripted",
                "_type" : "_doc",
                "_id" : "3",
                "_score" : 4.0,
                "fields" : {
                    "name" : [
                        "Charlie"
                    ]
                }
            },
            {
                "_index" : "ranking_people_scripted",
                "_type" : "_doc",
                "_id" : "2",
                "_score" : 3.0,
                "fields" : {
                    "name" : [
                        "Bob"
                    ]
                }
            }
        ]
    }
}
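For readers who want to check the arithmetic behind `_score`, the following is a plain-Python mirror of the Painless script above, using the same `term_sets` parameters. The document `doc` is hypothetical and made up for this sketch (the real documents are in Answer #1), and the snake_case function names are simply Python renderings of the script's names.

```python
def calculate_field_score(field_terms, query_terms):
    """Count how many query terms occur among the field's values."""
    return sum(1 for term in query_terms if term in field_terms)

def document_score(doc, term_sets):
    """Sum the per-field match counts, each multiplied by its boost."""
    total = 0
    for term_set in term_sets:
        field_terms = doc[term_set["field"]]
        field_score = calculate_field_score(field_terms, term_set["terms"])
        total += field_score * term_set["boost"]
    return total

# Same parameters as in the scripted query above.
term_sets = [
    {"terms": ["indie", "classical"], "field": "music", "boost": 1},
    {"terms": ["strawberries", "coffee"], "field": "foods", "boost": 1},
    {"terms": ["hockey", "basketball"], "field": "sports", "boost": 1},
]

# Hypothetical document, invented for illustration only.
doc = {
    "name": "Dana",
    "music": ["indie", "jazz"],             # 1 matching term
    "foods": ["coffee", "strawberries"],    # 2 matching terms
    "sports": ["tennis"],                   # 0 matching terms
}

print(document_score(doc, term_sets))  # 3
```

Raising a `boost` weights that field's matches more heavily. Note that the result is still a weighted count of matching terms per document, not a merge of per-field rank positions.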

You could also implement this scoring with a scripted runtime field or a script query.

Tags:

python

elasticsearch

Davide Fiocco