
What is the use of Field.SetOmitNorms(true); in Lucene?

Tags:

.net

lucene

I have been advised to call Field.SetOmitNorms(true); when creating the documents for a Lucene search, so that the results are sorted by the number of hits, but I am not clear on what it does or whether it is safe.

By "sorted by the number of hits" I mean that the document in which the search text occurs the most times should come first, followed by the documents with fewer matches for the search text.

I know it's silly, but I want to understand this before I implement it. Please help.

Pranali Desai asked Aug 27 '09 08:08



2 Answers

Check out this article for a good one-paragraph description of what omitting norms does in terms of optimisation. Basically, it's kind of like having a mini Lucene index for the terms inside a field, so it's really only useful for fields that contain a lot of text.

Maks answered Dec 06 '22 14:12



By default, a field is indexed with its norm: the product of the document's boost, the field's boost, and the field's length normalisation factor (see Similarity scoring). This adds one byte per document to the storage and memory consumption of every such field, which can be omitted for selected fields or field types using omitNorms.

The boosts are specified during indexing, while lengthNorm is calculated so that if two documents match a query term f times, the longer document will get a lower score.

So if you want your documents to be scored on the exact number of terms matched, rather than on the number of terms in proportion to the document length, use omitNorms (and get the memory savings for free). Because the boosts are stored in the same norm, any index-time boosts on that field are dropped as well.
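To make the effect concrete, here is a rough sketch in plain Python (not the Lucene API). It assumes the classic default similarity, where the term-frequency factor is sqrt(freq) and lengthNorm is 1/sqrt(number of terms in the field), and it ignores idf, query normalisation and the lossy one-byte encoding of the norm. The two documents and the names `score` and `rank` are made up for illustration.

```python
import math


def tf(freq):
    # Term-frequency factor, assuming the classic default: sqrt(freq).
    return math.sqrt(freq)


def length_norm(num_terms):
    # Length normalisation, assuming the classic default: 1 / sqrt(numTerms).
    return 1.0 / math.sqrt(num_terms)


def score(freq, num_terms, omit_norms, doc_boost=1.0, field_boost=1.0):
    # With norms omitted, the norm is treated as 1.0, so boosts and
    # field length no longer influence the score.
    if omit_norms:
        norm = 1.0
    else:
        norm = doc_boost * field_boost * length_norm(num_terms)
    return tf(freq) * norm


# Hypothetical documents: how often the term matches, and the field length.
docs = {
    "short": {"freq": 3, "num_terms": 9},
    "long": {"freq": 4, "num_terms": 400},
}


def rank(omit_norms):
    # Order document names by descending score.
    return sorted(
        docs,
        key=lambda name: score(docs[name]["freq"], docs[name]["num_terms"], omit_norms),
        reverse=True,
    )


print(rank(omit_norms=False))  # ['short', 'long']
print(rank(omit_norms=True))   # ['long', 'short']
```

With norms, the short document wins even though it has fewer matches (about 0.58 versus 0.10). With norms omitted, the document with more matches comes first, which is the ordering the question asks for.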

joeln answered Dec 06 '22 13:12
