I have been advised to use Field.setOmitNorms(true) when creating the documents for a Lucene search, so that results are sorted by the number of hits, but I am not clear on what it does or whether it is safe.
Sorting the results by the number of hits means that the document in which the search text is found the greatest number of times should come first, followed by the ones with fewer matches for the search text.
I know it's silly, but I want to understand this before I implement it. Please help.
Check out this article for a good one-paragraph description of what omitting norms does in terms of optimisation. Basically, norms are kind of like having a mini Lucene index for the terms inside a field, so they are really only useful for fields that contain a lot of text.
By default, a field is indexed with its norm, a product of the document's boost, the field's boost, and the field's length normalisation factor (see Similarity scoring). This adds a byte to the storage and memory consumption of every field, which can be omitted for selected fields or field types using omitNorms.
The boosts are specified during indexing, while lengthNorm is calculated so that if two documents match a query term f times, the longer document will get a lower score.
So if you want your documents to be scored based on the exact number of terms matched, rather than the number of terms in proportion to the document length, use omitNorms (and get the memory consumption benefits for free).
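To make the effect on ranking concrete, here is a minimal, self-contained sketch in plain Java. It does not call Lucene. It assumes the formulas of Lucene's classic TF-IDF similarity (a term-frequency factor of sqrt(freq) and a length norm of 1/sqrt(numTerms)) and ignores boosts, idf, and the one-byte encoding of the norm. The class and method names are hypothetical, chosen only for this illustration.

```java
public class OmitNormsSketch {

    // Term-frequency factor, assumed here to be sqrt(freq)
    // as in Lucene's classic TF-IDF similarity.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Length normalisation, assumed here to be 1/sqrt(numTerms):
    // the more terms a field contains, the smaller the factor.
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // Simplified score for a single query term. With omitNorms the
    // length factor is replaced by 1, so only the match count matters.
    static double score(int freq, int numTerms, boolean omitNorms) {
        double norm = omitNorms ? 1.0 : lengthNorm(numTerms);
        return tf(freq) * norm;
    }

    public static void main(String[] args) {
        // Short document: 2 matches in 10 terms.
        // Long document: 5 matches in 1000 terms.
        System.out.println("with norms    short=" + score(2, 10, false)
                + " long=" + score(5, 1000, false));
        System.out.println("without norms short=" + score(2, 10, true)
                + " long=" + score(5, 1000, true));
    }
}
```

With norms, the short document ranks higher even though it has fewer matches, because the long document is penalised for its length. Without norms, the long document ranks higher because it has more matches, which is the ordering asked for in the question.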