
In ElasticSearch, how does sort interact with function_score?

See my search query below with specific questions further down.

search = {
    'query': {
        'function_score': {
            'score_mode': 'multiply',
            'functions': functions,
            'query': {
                'match_all': {}
            },
            'filter': {
                'bool': {
                    'must': filters_include,
                    'must_not': filters_exclude
                }
            }
        }
    },
    'sort': [{'_score': {'order': 'desc'}},
             {'time': {'order': 'desc'}}]
}

where functions look like:

[{'weight': 5.0, 'gauss': {'time': {'scale': '7d'}}}, 
 {'weight': 3.0, 'script_score': {'script': "1+doc['scores.year'].value"}}, 
 {'weight': 2.0, 'script_score': {'script': "1+doc['scores.month'].value"}}]

What's happening when I run this query? Are documents being scored by function_score and then sorted after the fact with the sort array? What is _score now (note that query is match_all) and does it do anything in the sorting? If I reversed it and put time before _score in sort, what result should I be expecting?

user592419 asked Jul 17 '15 at 18:07


1 Answer

Without function_score, a match_all query gives every document the same score, meaning each doc gets 1.

With function_score, all three functions are computed for every document (none of them has its own filter, so they all match) and their results are multiplied together (because you have score_mode: multiply). So, roughly, the final score is function1_score * function2_score * function3_score. That resulting score is the _score used in the sort. If some _scores are equal, time is used to break the tie. If you reversed the sort and put time before _score, documents would be ordered by time first and _score would only matter for documents with the same time value.
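To make the sort behaviour concrete, here is a small sketch in plain Python. It does not talk to Elasticsearch; the three hits, their scores and their times are made up, and sort_hits is a hypothetical helper that only mimics a sort array where every entry is descending.

```python
# Hypothetical hits: _score as produced by function_score, time in epoch millis.
hits = [
    {"_id": "a", "_score": 10.0, "time": 1436140800000},
    {"_id": "b", "_score": 10.0, "time": 1436227200000},
    {"_id": "c", "_score": 25.0, "time": 1435708800000},
]

def sort_hits(hits, keys):
    """Order hits by each key in turn, all descending, like the 'sort' array."""
    return sorted(hits, key=lambda h: tuple(-h[k] for k in keys))

# _score first: "c" wins on score, "b" beats "a" only because it is newer.
score_first = [h["_id"] for h in sort_hits(hits, ["_score", "time"])]

# time first: the newest document wins and _score only breaks time ties.
time_first = [h["_id"] for h in sort_hits(hits, ["time", "_score"])]

print(score_first)
print(time_first)
```

With time listed first, _score rarely gets a say, because two documents seldom share the exact same timestamp.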

The best thing to do is to take the query out of your application, put it as JSON in Marvel's Sense dashboard, for example, and test it with ?explain. It will give you a detailed explanation of how each score is computed.
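One way to get that JSON out of the application is to serialize the Python dict. This is only a sketch: the question doesn't show filters_include, filters_exclude or the field names I tested with, so the values below are stand-ins I'm assuming (they mirror the filters and scripts visible in my explain output). Setting 'explain' in the request body is an alternative to the ?explain URL parameter.

```python
import json

# Assumed stand-ins for the variables the question does not show.
filters_include = [{'range': {'year': {'gte': 1990}}}]
filters_exclude = [{'range': {'month': {'gte': 13}}}]
functions = [
    {'weight': 5.0, 'gauss': {'time': {'scale': '7d'}}},
    {'weight': 3.0, 'script_score': {'script': "1+doc['year'].value"}},
    {'weight': 2.0, 'script_score': {'script': "1+doc['month'].value"}},
]

search = {
    'explain': True,  # ask for the score explanation of every hit
    'query': {
        'function_score': {
            'score_mode': 'multiply',
            'functions': functions,
            'query': {'match_all': {}},
            'filter': {
                'bool': {
                    'must': filters_include,
                    'must_not': filters_exclude,
                }
            },
        }
    },
    'sort': [{'_score': {'order': 'desc'}},
             {'time': {'order': 'desc'}}],
}

# Pretty-printed JSON that can be pasted into Sense under a _search request.
body = json.dumps(search, indent=2)
print(body)
```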

Let me give you an example: let's say we have a document containing "year":2015,"month":7,"time":"2015-07-06".

Running your query with _search?explain gives this very detailed explanation:

  "hits": [
     {
        "_shard": 4,
        "_node": "jt4AX7imTECLWH4Bofbk3g",
        "_index": "test",
        "_type": "test",
        "_id": "3",
        "_score": 26691.023,
        "_source": {
           "text": "whatever",
           "year": 2015,
           "month": 7,
           "time": "2015-07-06"
        },
        "sort": [
           26691.023,
           1436140800000
        ],
        "_explanation": {
           "value": 26691.023,
           "description": "function score, product of:",
           "details": [
              {
                 "value": 1,
                 "description": "ConstantScore(BooleanFilter(+cache(year:[1990 TO *]) -cache(month:[13 TO *]))), product of:",
                 "details": [
                    {
                       "value": 1,
                       "description": "boost"
                    },
                    {
                       "value": 1,
                       "description": "queryNorm"
                    }
                 ]
              },
              {
                 "value": 26691.023,
                 "description": "Math.min of",
                 "details": [
                    {
                       "value": 26691.023,
                       "description": "function score, score mode [multiply]",
                       "details": [
                          {
                             "value": 0.2758249,
                             "description": "function score, product of:",
                             "details": [
                                {
                                   "value": 1,
                                   "description": "match filter: *:*"
                                },
                                {
                                   "value": 0.2758249,
                                   "description": "product of:",
                                   "details": [
                                      {
                                         "value": 0.055164978,
                                         "description": "Function for field time:",
                                         "details": [
                                            {
                                               "value": 0.055164978,
                                               "description": "exp(-0.5*pow(MIN[Math.max(Math.abs(1.4361408E12(=doc value) - 1.437377331833E12(=origin))) - 0.0(=offset), 0)],2.0)/2.63856688924644672E17)"
                                            }
                                         ]
                                      },
                                      {
                                         "value": 5,
                                         "description": "weight"
                                      }
                                   ]
                                }
                             ]
                          },
                          {
                             "value": 6048,
                             "description": "function score, product of:",
                             "details": [
                                {
                                   "value": 1,
                                   "description": "match filter: *:*"
                                },
                                {
                                   "value": 6048,
                                   "description": "product of:",
                                   "details": [
                                      {
                                         "value": 2016,
                                         "description": "script score function, computed with script:\"1+doc['year'].value",
                                         "details": [
                                            {
                                               "value": 1,
                                               "description": "_score: ",
                                               "details": [
                                                  {
                                                     "value": 1,
                                                     "description": "ConstantScore(BooleanFilter(+cache(year:[1990 TO *]) -cache(month:[13 TO *]))), product of:",
                                                     "details": [
                                                        {
                                                           "value": 1,
                                                           "description": "boost"
                                                        },
                                                        {
                                                           "value": 1,
                                                           "description": "queryNorm"
                                                        }
                                                     ]
                                                  }
                                               ]
                                            }
                                         ]
                                      },
                                      {
                                         "value": 3,
                                         "description": "weight"
                                      }
                                   ]
                                }
                             ]
                          },
                          {
                             "value": 16,
                             "description": "function score, product of:",
                             "details": [
                                {
                                   "value": 1,
                                   "description": "match filter: *:*"
                                },
                                {
                                   "value": 16,
                                   "description": "product of:",
                                   "details": [
                                      {
                                         "value": 8,
                                         "description": "script score function, computed with script:\"1+doc['month'].value",
                                         "details": [
                                            {
                                               "value": 1,
                                               "description": "_score: ",
                                               "details": [
                                                  {
                                                     "value": 1,
                                                     "description": "ConstantScore(BooleanFilter(+cache(year:[1990 TO *]) -cache(month:[13 TO *]))), product of:",
                                                     "details": [
                                                        {
                                                           "value": 1,
                                                           "description": "boost"
                                                        },
                                                        {
                                                           "value": 1,
                                                           "description": "queryNorm"
                                                        }
                                                     ]
                                                  }
                                               ]
                                            }
                                         ]
                                      },
                                      {
                                         "value": 2,
                                         "description": "weight"
                                      }
                                   ]
                                }
                             ]
                          }
                       ]
                    },
                    {
                       "value": 3.4028235e+38,
                       "description": "maxBoost"
                    }
                 ]
              },
              {
                 "value": 1,
                 "description": "queryBoost"
              }
           ]
        }
     }

So, for gauss the score computed is 0.055164978. I don't know how relevant this is for your question, but let's assume the computation is correct :-). Your gauss function weight is 5, so the score becomes 5 * 0.055164978 = 0.27582489.
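If you'd rather check the gauss value than take it on faith, the formula in the explain output can be replayed by hand. A sketch, with the numbers copied from the explanation. Two things are assumptions on my side: decay is the default 0.5 (the query doesn't set it), and the origin is whatever "now" was when I ran the query, since no origin was given for the date field.

```python
import math

# Values copied from the explain output.
doc_value = 1.4361408e12        # "time" of the document, epoch millis
origin = 1.437377331833e12      # origin reported by explain (assumed to be "now")
offset = 0.0
scale = 7 * 24 * 60 * 60 * 1000  # '7d' expressed in millis
decay = 0.5                      # assumed default, not set in the query

# sigma^2 is chosen so that the score equals `decay` at distance `scale`.
sigma_sq = -scale ** 2 / (2.0 * math.log(decay))

distance = max(abs(doc_value - origin) - offset, 0.0)
gauss = math.exp(-0.5 * distance ** 2 / sigma_sq)

print(sigma_sq)   # the divisor shown in the explain output, about 2.6386e17
print(gauss)      # about 0.0552
print(5 * gauss)  # about 0.2758 once the weight of 5 is applied
```

Because the origin is the query time, this value shrinks as the document gets older, so the same query run later will give a different _score.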

For the script year function we have (1 + 2015) * 3 = 6048.

For the script month function we have (1 + 7) * 2 = 16.

Since score_mode is multiply, the total score for this document is 0.27582489 * 6048 * 16 = 26691.023.
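The same arithmetic as a few lines of Python, in case you want to plug in other documents. The gauss value is copied from the explain output; query_score is the 1 that match_all contributes, which gets multiplied in as well (the default boost_mode is multiply, which the query doesn't override).

```python
gauss = 0.055164978     # decay value reported by explain
year, month = 2015, 7   # fields of the example document
query_score = 1.0       # match_all gives every document 1

f1 = 5.0 * gauss        # weight * gauss
f2 = 3.0 * (1 + year)   # weight * (1 + year)
f3 = 2.0 * (1 + month)  # weight * (1 + month)

# score_mode multiply combines the functions; the query score is multiplied in too.
score = query_score * f1 * f2 * f3
print(f1, f2, f3, score)
```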

There is also a section in each hit that shows which values were used for sorting. In this document's case:

        "sort": [
           26691.023,
           1436140800000
        ]

The first number is the _score computed as shown, the second is the milliseconds representation of date 2015-07-06.
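If you want to double check a sort value like that, converting the milliseconds back to a date is a one-liner (interpreting the value as UTC):

```python
from datetime import datetime, timezone

millis = 1436140800000  # second entry of the "sort" array
as_date = datetime.fromtimestamp(millis / 1000, tz=timezone.utc)
print(as_date.isoformat())  # 2015-07-06T00:00:00+00:00
```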

Andrei Stefan answered Oct 20 '22 at 21:10