I am trying to boost particular documents based on a field value. This generally works, but some documents get a higher score even though they have a smaller boost value.
After debugging the query with the debugQuery=on request parameter, I noticed that the idf function is returning a higher score for a particular document, which is affecting the overall score.
Is there a way to ignore tf/idf scoring at query time?
Lucene's default ranking function uses factors such as tf, idf, and norm to help calculate relevancy scores. Solr has now exposed these factors as function queries.
Term frequency-inverse document frequency (TF-IDF) term vectors are often used to represent text documents when performing text mining and machine learning operations. The math expressions library can be used to perform text analysis and create TF-IDF term vectors.
TF-IDF (term frequency-inverse document frequency) is an information retrieval technique that helps find the most relevant documents corresponding to a given query. TF is a measure of how often a phrase appears in a document, and IDF is about how important that phrase is.
Only tf(life) depends on the query itself. The idf of a query term depends on the background documents, so idf(life) = 1 + ln(3/2) ≈ 1.405465. That is why tf-idf is defined as the product of a local component (term frequency) and a global component (inverse document frequency).
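The arithmetic above can be checked with a few lines of plain Java. This is only a sketch: it assumes the smoothed form idf = 1 + ln(numDocs / docCount), with numDocs = 3 and docCount = 2 read off the ratio in the text. What docCount counts exactly is not stated there, and the class and method names are made up for illustration.

```java
public class IdfExample {
    // Smoothed inverse document frequency: 1 + ln(numDocs / docCount).
    // Assumption: the text only gives the ratio 3/2, not what docCount counts.
    public static double idf(long numDocs, long docCount) {
        return 1.0 + Math.log((double) numDocs / docCount);
    }

    // tf-idf weight: the local term frequency times the global idf.
    public static double tfIdf(double tf, long numDocs, long docCount) {
        return tf * idf(numDocs, docCount);
    }

    public static void main(String[] args) {
        // idf for the ratio 3/2 used in the text.
        System.out.printf("idf(life) = %.6f%n", idf(3, 2));
        // The same term appearing twice in the query (hypothetical tf = 2).
        System.out.printf("tf-idf(life) = %.6f%n", tfIdf(2.0, 3, 2));
    }
}
```

Only the tf argument changes from query to query; the two counts come from the indexed collection.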
You'll want to create a custom Similarity which overrides the tf and idf methods, and use it in place of the DefaultSimilarity.
Something like:
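// Lucene 3.x package; in 4.x it moved to org.apache.lucene.search.similarities.
import org.apache.lucene.search.DefaultSimilarity;

public class CustomSimilarity extends DefaultSimilarity {
    // Return a constant so term frequency no longer influences the score.
    @Override
    public float tf(float freq) {
        return 1.0f;
    }

    @Override
    public float tf(int freq) {
        return 1.0f;
    }

    // Return a constant so rare terms no longer get a higher weight.
    // Note the signature of this method may now take longs:
    // public float idf(long docFreq, long numDocs)
    @Override
    public float idf(int docFreq, int numDocs) {
        return 1.0f;
    }
}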
Then set Solr to use that similarity in your schema.xml:
<similarity class="myorg.mypackage.CustomSimilarity"/>
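To see why pinning tf and idf to 1 makes the boost decide the order, here is a toy model in plain Java. It is a sketch under loud assumptions: the per-term score is taken as tf * idf^2 * boost, which keeps only those factors from Lucene's classic formula and ignores norm, coord and queryNorm. The documents, boosts and idf values are hypothetical.

```java
public class BoostOnlyScoreSketch {
    // Toy per-term score: tf * idf^2 * boost.
    // Assumption: norm, coord and queryNorm are left out of this model.
    public static double score(double tf, double idf, double boost) {
        return tf * idf * idf * boost;
    }

    public static void main(String[] args) {
        // Hypothetical documents: A has the larger boost, B matches a rarer value.
        double boostA = 2.0, idfA = 1.0;
        double boostB = 1.5, idfB = 1.5;

        // With idf in play, B can outrank A despite its smaller boost.
        System.out.println("B beats A with idf: "
                + (score(1.0, idfB, boostB) > score(1.0, idfA, boostA)));

        // With tf and idf pinned to 1, the order follows the boost alone.
        System.out.println("A beats B without idf: "
                + (score(1.0, 1.0, boostA) > score(1.0, 1.0, boostB)));
    }
}
```

In a real index the length norm can still shift scores unless norms are omitted on the field, so treat this as an illustration of the idea, not of Solr's exact output.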