Elasticsearch: Influence scoring with custom score field in document

Tags: elasticsearch

I have a set of words extracted from text by NLP algorithms, with an associated score for each word in every document.

For example:

document 1: {  "vocab": [ {"wtag":"James Bond", "rscore": 2.14 }, 
                          {"wtag":"world", "rscore": 0.86 }, 
                          ...., 
                          {"wtag":"somemore", "rscore": 3.15 }
                        ] 
            }

document 2: {  "vocab": [ {"wtag":"hiii", "rscore": 1.34 }, 
                          {"wtag":"world", "rscore": 0.94 },
                          ...., 
                          {"wtag":"somemore", "rscore": 3.23 } 
                        ] 
            }

I want the rscore of each matched wtag in a document to affect the _score that ES gives it, for example by being multiplied with or added to the _score, so that it influences the final _score (and, in turn, the order) of the resulting documents. Is there any way to achieve this?

asked Jan 29 '14 18:01

Haywire

1 Answer

Another way of approaching this would be to use nested documents:

First, set up the mapping to make vocab a nested field, meaning that each wtag/rscore object is indexed internally as a separate document:

curl -XPUT "http://localhost:9200/myindex/" -d'
{
  "settings": {"number_of_shards": 1}, 
  "mappings": {
    "mytype": {
      "properties": {
        "vocab": {
          "type": "nested",
          "properties": {
            "wtag": {
              "type": "string"
            },
            "rscore": {
              "type": "float"
            }
          }
        }
      }
    }
  }
}'
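
To see why the nested type matters here, the following Python sketch mimics, outside Elasticsearch, the difference between a plain object array (whose sub-fields are flattened into multi-valued fields, losing the pairing of each wtag with its rscore) and a nested field (where each entry stays a separate unit). It is only an illustration: the helper names flatten_object_array and split_nested are made up for this example and are not Elasticsearch APIs.

```python
# Illustration only: mimics how a non-nested object array is flattened at
# index time, versus how a nested field keeps each entry as its own unit.
doc = {
    "vocab": [
        {"wtag": "James Bond", "rscore": 2.14},
        {"wtag": "world", "rscore": 0.86},
        {"wtag": "somemore", "rscore": 3.15},
    ]
}


def flatten_object_array(document, field):
    """Collapse an array of objects into one multi-valued list per sub-field."""
    flat = {}
    for entry in document[field]:
        for key, value in entry.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


def split_nested(document, field):
    """Keep each array entry as a separate unit, as a nested mapping does."""
    return [
        {f"{field}.{key}": value for key, value in entry.items()}
        for entry in document[field]
    ]


flattened = flatten_object_array(doc, "vocab")
nested_docs = split_nested(doc, "vocab")

print(flattened)    # one list of wtags, one list of rscores: the pairing is gone
print(nested_docs)  # three small documents, each with its own wtag and rscore
```

In the flattened form there is no way to tell which rscore belongs to which wtag, which is why a per-word score needs the nested mapping.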

Then index your docs:

curl -XPUT "http://localhost:9200/myindex/mytype/1" -d'
{
  "vocab": [
    {
      "wtag": "James Bond",
      "rscore": 2.14
    },
    {
      "wtag": "world",
      "rscore": 0.86
    },
    {
      "wtag": "somemore",
      "rscore": 3.15
    }
  ]
}'

curl -XPUT "http://localhost:9200/myindex/mytype/2" -d'
{
  "vocab": [
    {
      "wtag": "hiii",
      "rscore": 1.34
    },
    {
      "wtag": "world",
      "rscore": 0.94
    },
    {
      "wtag": "somemore",
      "rscore": 3.23
    }
  ]
}'

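Then run a nested query. For each nested document that matches, the function_score query combines the match score with the value of rscore (by default the two are multiplied), and "score_mode": "sum" adds up the results across all matching nested documents to give the parent document's _score: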

curl -XGET "http://localhost:9200/myindex/mytype/_search" -d'
{
  "query": {
    "nested": {
      "path": "vocab",
      "score_mode": "sum",
      "query": {
        "function_score": {
          "query": {
            "match": {
              "vocab.wtag": "james bond world"
            }
          },
          "script_score": {
            "script": "doc[\"vocab.rscore\"].value"
          }
        }
      }
    }
  }
}'
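
The arithmetic behind this query can be sketched in Python. This is a rough model, not Elasticsearch code: the per-entry match scores below are placeholder values (real ones are computed by Lucene and depend on the index), and the helper names combine and nested_sum_score are hypothetical. The boost_mode values shown ("multiply", "sum", "replace") are options of the function_score query, with "multiply" as the default.

```python
# Rough model of how the nested query above arrives at a parent _score.
# ASSUMPTION: match_scores holds made-up text-match scores per wtag;
# Elasticsearch computes the real ones.

def combine(match_score, rscore, boost_mode="multiply"):
    """Combine the text-match score with the script_score value (rscore)."""
    if boost_mode == "multiply":
        return match_score * rscore
    if boost_mode == "sum":
        return match_score + rscore
    if boost_mode == "replace":
        return rscore
    raise ValueError(f"unsupported boost_mode in this sketch: {boost_mode}")


def nested_sum_score(vocab, match_scores, boost_mode="multiply"):
    """Model score_mode "sum": add up the scores of matching nested entries."""
    return sum(
        combine(match_scores[entry["wtag"]], entry["rscore"], boost_mode)
        for entry in vocab
        if entry["wtag"] in match_scores
    )


doc1 = [
    {"wtag": "James Bond", "rscore": 2.14},
    {"wtag": "world", "rscore": 0.86},
    {"wtag": "somemore", "rscore": 3.15},
]
doc2 = [
    {"wtag": "hiii", "rscore": 1.34},
    {"wtag": "world", "rscore": 0.94},
    {"wtag": "somemore", "rscore": 3.23},
]

# Placeholder match scores for the query "james bond world".
match_scores = {"James Bond": 1.0, "world": 0.5}

score1 = nested_sum_score(doc1, match_scores)
score2 = nested_sum_score(doc2, match_scores)
print(score1, score2)

# With boost_mode "replace", only the rscore values are summed.
print(nested_sum_score(doc1, match_scores, boost_mode="replace"))
```

Under these placeholder scores document 1 ranks above document 2, because it matches two entries and its matching rscore values contribute more. Setting "boost_mode": "replace" in the function_score query would make the parent score the plain sum of the matching rscore values.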

answered Sep 29 '22 04:09

DrTech
