How can I boost the field length norm in elasticsearch function score?

Tags:

elasticsearch

boosting

I know that Elasticsearch takes the length of a field into account when computing the score of the documents retrieved by a query. The shorter the field, the higher the weight (see The field-length norm).

I like this behaviour: when I search for "iphone" I am much more interested in "iphone 6" than in "Crappy accessories for: iphone 5 iphone 5s iphone 6".
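For reference, the classic Lucene TF/IDF similarity described in the linked guide computes the field-length norm as 1 / sqrt(numTerms). The snippet below is only a rough illustration of what that means for the two titles above. It assumes a simplified lowercase word tokenizer (the hypothetical `tokenize` helper) as a stand-in for the real analyzer, and it ignores the lossy one-byte encoding Lucene applies when it stores norms.

```python
import math
import re


def tokenize(text):
    # Simplified stand-in for an analyzer: lowercase, then split on non-word characters.
    return re.findall(r"\w+", text.lower())


def field_length_norm(text):
    # Classic TF/IDF norm: 1 / sqrt(number of terms in the field).
    return 1.0 / math.sqrt(len(tokenize(text)))


short_title = "iphone 6"
long_title = "Crappy accessories for: iphone 5 iphone 5s iphone 6"

for title in (short_title, long_title):
    print(len(tokenize(title)), round(field_length_norm(title), 3), title)
```

Under these assumptions the short title has 2 terms and a norm of about 0.71, while the long one has 9 terms and a norm of about 0.33.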

Now I would like to boost this effect: let's say I want to double its importance.

I know that the score can be modified with a function score query, and I guess I can achieve what I want via script_score.

I tried to add another field-length norm to the score like this:

    {
      "query": {
        "function_score": {
          "boost_mode": "replace",
          "query": {...},
          "script_score": {
            "script": "_score + norm(doc)"
          }
        }
      }
    }

But I failed badly, getting this error: [No parser for element [function_score]]

EDIT:

My first error was that I hadn't wrapped the function score in a "query". I have now edited the code above. The new error is:

GroovyScriptExecutionException[MissingMethodException
[No signature of method: Script5.norm() is applicable for argument types:
(org.elasticsearch.search.lookup.DocLookup) values: 
[<org.elasticsearch.search.lookup.DocLookup@2c935f6f>]
Possible solutions: notify(), wait(), run(), run(), dump(), any()]]

EDIT: I provided a first answer, but I'm hoping for a better one

918

asked Aug 17 '15 21:08

Mario Trucco

2 Answers

It looks like you could achieve that using a field of type token_count together with a field_value_factor function score.

So, something like this in the field mapping:

"name": { 
  "type": "string",
  "fields": {
    "length": { 
      "type":     "token_count",
      "analyzer": "standard"
    }
  }
}

This will use the number of tokens in the field. If you want to use the number of characters, you can change the analyzer from standard to a custom one that tokenizes each character.
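One possible shape for such a per-character analyzer, as an untested sketch: it is written as a Python dict that serializes to the index settings JSON. The names `single_char` and `char_count` are made up for illustration, and the tokenizer type is spelled `ngram` here (older releases document it as `nGram`), so check the spelling against the version in use.

```python
import json

# Hypothetical names: "single_char" (tokenizer) and "char_count" (analyzer).
# An n-gram tokenizer with min_gram = max_gram = 1 should emit one token per
# character, so the token_count sub-field becomes a character count.
# With the default token_chars setting, whitespace is expected to be counted too.
index_settings = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "single_char": {"type": "ngram", "min_gram": 1, "max_gram": 1}
            },
            "analyzer": {
                "char_count": {"type": "custom", "tokenizer": "single_char"}
            },
        }
    }
}

# The "length" sub-field would then reference the custom analyzer instead of "standard".
length_subfield = {"type": "token_count", "analyzer": "char_count"}

print(json.dumps(index_settings, indent=2))
print(json.dumps(length_subfield))
```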

Then in the query:

"function_score": {
  ...,
  "field_value_factor": {
    "field": "name.length",
    "modifier": "reciprocal"
  }
}
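To see what this does to the ranking, here is a small arithmetic sketch. It assumes the documented behaviour of field_value_factor (the modifier is applied to factor * field value) and the default boost_mode of multiply. The helper names and the base score of 1.0 are made up for illustration.

```python
def field_value_factor(value, factor=1.0, modifier="none"):
    # The modifier is applied to factor * value.
    scaled = factor * value
    if modifier == "reciprocal":
        return 1.0 / scaled
    if modifier == "none":
        return scaled
    raise ValueError("modifier not covered by this sketch: " + modifier)


def final_score(query_score, token_count, factor=1.0):
    # Default boost_mode "multiply": the query score times the function value.
    return query_score * field_value_factor(token_count, factor, "reciprocal")


base_score = 1.0  # made-up query score, identical for both documents
print(final_score(base_score, 2))  # 2-token title such as "iphone 6"
print(final_score(base_score, 9))  # 9-token accessory title
```

With the same base score, the 2-token title ends up at 0.5 and the 9-token title at about 0.11. Note that with boost_mode set to replace, as in the question, only the function value would be kept and the original query score would be discarded.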

142

answered Oct 08 '22 04:10

robinst

I have something that kind of works. With the following, I subtract the length of the field I am interested in from the score.

{
  "query": {
    "function_score": {
      "boost_mode": "replace",
      "query": {...},
      "script_score": {
        "script": "_score - doc['<field_name>'].value.length()"
      }
    }
  }
}

However, I cannot control the relative weight of the number I am subtracting compared to the original score. That's why I am not accepting my own answer: I'll wait a while for better ones. Ideally, I'd love to have a way to access the field-length norm function within the script_score, or to get an equivalent result.
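One way to get a tunable weight, sketched under assumptions and not tested against a cluster: pass the weight as a script parameter and read the length from the `name.length` token_count sub-field suggested in the other answer. Dividing by the square root of the token count applies roughly the 1 / sqrt(numTerms) factor a second time. The parameter name `exponent` and the `match` query standing in for the elided query are hypothetical, and the request uses the older layout from this thread (`script` as a string with sibling `params`); newer releases nest these differently.

```python
import json
import math


def rescored(score, token_count, exponent=0.5):
    # exponent = 0.5 divides by sqrt(n) once more; larger values penalise length harder.
    return score / math.pow(token_count, exponent)


body = {
    "query": {
        "function_score": {
            "boost_mode": "replace",
            "query": {"match": {"name": "iphone"}},  # hypothetical stand-in query
            "script_score": {
                "params": {"exponent": 0.5},  # hypothetical tunable weight
                "script": "_score / Math.pow(doc['name.length'].value, exponent)",
            },
        }
    }
}

print(json.dumps(body, indent=2))
print(rescored(1.0, 2), rescored(1.0, 9))
```

Raising `exponent` makes long fields lose more, and setting it to 0 leaves the original score unchanged, which gives the control over relative weight that plain subtraction lacks.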

answered Oct 08 '22 03:10

Mario Trucco
