I've been experimenting with fulltext search lately and am curious about the meaning of the Score value. For example I have the following query:
SELECT table. * ,
MATCH (
col1, col2, col3
)
AGAINST (
'+(Term1) +(Term1)'
) AS Score
FROM table
WHERE MATCH (
col1, col2, col3
)
AGAINST (
'+(Term1) +(Term1)'
)
In the results for Score I've seen results, for one query, between 0.4667041301727 to 11.166275978088. I get that it's MySQLs idea of relevance (the higher the more weight).
What I don't get is how MySQL comes up with that score. Why is the number not returned as a decimal or something besides ?
How come if I run a query "IN BOOLEAN MODE" does the score always return a 1 or a 0 ? Wouldn't all the results be a 1?
Just hoping for some enlightenment. Thanks.
Take the query "word1 word2" as an example.
BOOLEAN mode indicates that your entire query matches the document (e.g. it contains both word1 AND word2). Boolean mode is a strict match.
The formula normally used is based on the Vector Space Model of searching. Very simplified, it figures out two measures to determine how important a word is to a query. The term frequency (terms that occur often in a document are more important than other terms) and the inverse document frequency (a term that occurs in many documents is weighted lower than a term that occurs in few documents). This is known as tf-idf, and is used as a basis for the vector space model. These scores form the basis for the Vector Space Model, which someone else can explain thoroughly. :)
Generally relevance is based on how many matches each row has to the words given to the search. The exact value will depend on many things, but it really only matters for comparing to other relevance values in the same query.
If you really want the math behind it, you can find it at the internals manual.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With