I would like to run searches in Elasticsearch that ignore the field-length norm in tf-idf scoring. This can be done by disabling norms in the index mappings, but that changes how the data is indexed. I only want to change the search, because I need the norms for other types of searches. What is the best way to accomplish this? I'm using elasticsearch.js as my interface to Elasticsearch.
You can't disable norms on a per-search basis, but you can use the Multi Fields API to add an additional field where the norms are disabled.
PUT /my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "my_field": {
          "type": "string",
          "fields": {
            "no_norms": {
              "type": "string",
              "norms": {
                "enabled": false
              }
            }
          }
        }
      }
    }
  }
}
Now you can search on my_field if you need norms and on my_field.no_norms if you don't. You have to reindex the data for the new field to be available for all documents; just adding it to the mapping won't change anything for existing docs.
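As a rough sketch of how the two fields might be queried from elasticsearch.js, the helper below builds the search parameters and switches to the sub-field when norms should be ignored. The helper name buildSearchParams is hypothetical, and the index name my_index and field names are taken from the mapping above; the client itself is assumed to be configured elsewhere.

```javascript
// Hypothetical helper (not part of elasticsearch.js): builds the params
// object that would be passed to client.search().
function buildSearchParams(text, ignoreNorms) {
  // Use the norm-free sub-field when field length should not affect scoring.
  const field = ignoreNorms ? "my_field.no_norms" : "my_field";
  return {
    index: "my_index",
    body: {
      query: {
        match: { [field]: text }
      }
    }
  };
}

// Usage with an already configured client (assumed):
//   client.search(buildSearchParams("quick brown fox", true))
//     .then(function (resp) { /* response shape depends on the client version */ });
console.log(JSON.stringify(buildSearchParams("quick brown fox", true), null, 2));
```

Keeping the field choice in one place means the rest of the application only has to decide whether a given search should respect norms.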
So this is the approach I ended up using. Instead of tf-idf (the current Elasticsearch default) I used BM25, which is supposedly better. It also has a parameter "b" that controls the importance of the field-length norm: with b = 0 the field-length norm is ignored, while the default value is 0.75. A discussion of BM25 can be found here. Inside my elasticsearch.yml I have
index:
  similarity:
    default:
      type: BM25
      b: 0.0
      k1: 1.2
    norm_bm25:
      type: BM25
      b: 0.75
      k1: 1.2
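To illustrate why b = 0 removes the effect of field length, here is a minimal sketch of the term-frequency part of the textbook BM25 formula. The function name bm25TfComponent and the sample numbers are made up for illustration, and this is not Lucene's exact implementation (which also multiplies by idf and stores norms in a compressed form).

```javascript
// Term-frequency component of textbook BM25 for one term in one document.
// tf: term frequency, dl: field length, avgdl: average field length.
function bm25TfComponent(tf, dl, avgdl, k1 = 1.2, b = 0.75) {
  // b scales how strongly the relative field length influences the score.
  const lengthNorm = 1 - b + b * (dl / avgdl);
  return (tf * (k1 + 1)) / (tf + k1 * lengthNorm);
}

// With b = 0 the length term collapses to 1, so short and long fields score the same.
const shortField = bm25TfComponent(2, 10, 100, 1.2, 0);
const longField = bm25TfComponent(2, 1000, 100, 1.2, 0);
console.log(shortField === longField); // true

// With the default b = 0.75 the shorter field scores higher.
console.log(bm25TfComponent(2, 10, 100) > bm25TfComponent(2, 1000, 100)); // true
```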
For those who use the Elasticsearch JavaScript API, the custom similarity can then be defined during index creation:
// Create the index and select the similarity defined in elasticsearch.yml.
client.indices.create({
  index: "db",
  body: {
    settings: {
      number_of_shards: 1,
      similarity: "norm_bm25"
    }
  }
});