
Recommended title boost?

Tags: solr, lucene

I have a relatively simple Lucene index, being served by Solr. The index consists of two major fields, title and body, and a few less-important fields.

Most search engines give more relevance to results with matches in the title than to results with matches only in the body. I'm going to start providing an index-time boost to the title field.

My question is, what values do people typically use for their title fields? 2? 4? 10? 100?

Frank Farmer asked Jun 22 '26 15:06


1 Answer

I suggest you divide the median body length by the median title length. This gives a rough factor M: for every M appearances of a word in the body, you would expect about one appearance in the title. Then use a boost of something like M*3. This is, of course, a rationalized heuristic, and it is best to iterate over the values and check the results. See Grant Ingersoll's "Debugging Relevance Issues in Search" for a much more structured discussion.
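As a rough illustration of the heuristic, here is a minimal Python sketch. It assumes length is measured in whitespace-separated tokens, which may differ from what your Solr analyzer produces. The function name `suggest_title_boost` and the `multiplier` parameter are made up for this example and are not part of Solr or Lucene.

```python
from statistics import median


def suggest_title_boost(docs, multiplier=3.0):
    """Return a starting title boost from (title, body) string pairs.

    Hypothetical helper: computes M = median body length / median title
    length, then scales it by `multiplier` (3 in the heuristic above).
    Lengths are counted in whitespace-separated tokens (an assumption).
    """
    title_lengths = []
    body_lengths = []
    for title, body in docs:
        title_lengths.append(len(title.split()))
        body_lengths.append(len(body.split()))

    median_title = median(title_lengths)
    if median_title == 0:
        raise ValueError("median title length is zero; cannot compute a ratio")

    m = median(body_lengths) / median_title
    return m * multiplier


if __name__ == "__main__":
    # Toy corpus: median title length 3 tokens, median body length 30 tokens.
    sample = [
        ("a b", "w " * 20),
        ("a b c d", "w " * 40),
        ("a b c", "w " * 30),
    ]
    print(suggest_title_boost(sample))
```

Treat the result as a starting value only. Try a few boosts around it and compare the rankings on real queries.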

Yuval F answered Jun 28 '26 17:06


