Suppose I have two columns, keywords and content. I have a fulltext index across both. I want a row with foo in the keywords to have more relevance than a row with foo in the content. What do I need to do to cause MySQL to weight the matches in keywords higher than those in content?
I'm using the "match against" syntax.
SOLUTION:
I was able to make this work as follows:
SELECT *,
    CASE WHEN Keywords LIKE '%watermelon%' THEN 1 ELSE 0 END AS keywordmatch,
    CASE WHEN Content LIKE '%watermelon%' THEN 1 ELSE 0 END AS contentmatch,
    MATCH (Title, Keywords, Content) AGAINST ('watermelon') AS relevance
FROM about_data
WHERE MATCH (Title, Keywords, Content) AGAINST ('watermelon' IN BOOLEAN MODE)
HAVING relevance > 0
ORDER BY keywordmatch DESC, contentmatch DESC, relevance DESC
Full-text searching is performed using MATCH() AGAINST() syntax. MATCH() takes a comma-separated list that names the columns to be searched. AGAINST takes a string to search for, and an optional modifier that indicates what type of search to perform.
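As a rough illustration of the optional modifier, the queries below run the same search in three modes. They reuse the about_data table from the solution above and assume a FULLTEXT index exists on exactly (Keywords, Content); the boolean search terms are made up for the example.

```sql
-- Natural language mode (the default when no modifier is given).
SELECT Keywords,
       MATCH (Keywords, Content) AGAINST ('watermelon' IN NATURAL LANGUAGE MODE) AS relevance
FROM about_data
WHERE MATCH (Keywords, Content) AGAINST ('watermelon' IN NATURAL LANGUAGE MODE);

-- Boolean mode: + requires a word, - excludes rows containing it.
SELECT Keywords
FROM about_data
WHERE MATCH (Keywords, Content) AGAINST ('+watermelon -seedless' IN BOOLEAN MODE);

-- Query expansion: a second pass adds words taken from the top-ranked rows.
SELECT Keywords
FROM about_data
WHERE MATCH (Keywords, Content) AGAINST ('watermelon' WITH QUERY EXPANSION);
```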
Full-Text Search in MySQL server lets users run full-text queries against character-based data in MySQL tables. You must create a full-text index on the table before you run full-text queries on a table. The full-text index can include one or more character-based columns in the table.
Full-text indexes are created on text-based columns (CHAR, VARCHAR, or TEXT columns) to speed up queries and DML operations on data contained within those columns. A full-text index is defined as part of a CREATE TABLE statement or added to an existing table using ALTER TABLE or CREATE INDEX.
To drop a FULLTEXT index, use the ALTER TABLE ... DROP INDEX statement.
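A minimal sketch of dropping one, assuming a placeholder table called your_table and a hypothetical index name ft_keyword (neither name comes from the original posts):

```sql
-- List the indexes on the table to find the real index name first.
SHOW INDEX FROM your_table;

-- ft_keyword is a hypothetical name; substitute the Key_name reported above.
ALTER TABLE your_table DROP INDEX ft_keyword;
```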
Create three full-text indexes: one on keyword only, one on content only, and one on both (keyword, content).
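One way the three indexes might be created is sketched here. The table name your_table and the index names are placeholders, not names from the original answer. The statements are issued separately because InnoDB has, at least in some versions, refused to build more than one FULLTEXT index in a single ALTER TABLE.

```sql
-- Relevance index: scores matches in the keyword column only.
CREATE FULLTEXT INDEX ft_keyword ON your_table (keyword);

-- Relevance index: scores matches in the content column only.
CREATE FULLTEXT INDEX ft_content ON your_table (content);

-- Recall index: used by the WHERE clause that searches both columns.
CREATE FULLTEXT INDEX ft_keyword_content ON your_table (keyword, content);
```

The column list in each MATCH() call has to correspond to the column list of one of these indexes, which is why a separate index is needed for each combination.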
Then, your query:
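SELECT id, keyword, content,
    MATCH (keyword) AGAINST ('watermelon') AS rel1,
    MATCH (content) AGAINST ('watermelon') AS rel2
FROM your_table
WHERE MATCH (keyword, content) AGAINST ('watermelon')
ORDER BY (rel1 * 1.5) + (rel2) DESC

(your_table is a placeholder; TABLE itself is a reserved word in MySQL and cannot be used unquoted as a table name.)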
The point is that rel1 gives you the relevance of your query just in the keyword column (because you created the index only on that column). rel2 does the same, but for the content column. You can now add these two relevance scores together, applying any weighting you like.
However, you aren't using either of these two indexes for the actual search. For that, you use your third index, which is on both columns.
The index on (keyword, content) controls your recall, that is, which rows are returned.
The two separate indexes (one on keyword only, one on content only) control your relevance. And you can apply your own weighting criteria here.
Note that you can use any number of different indexes, or vary the indexes and weightings you use at query time based on other factors. For example, you might search only on keyword if the query contains a stop word, or decrease the weighting bias for keywords if the query contains more than 3 words.
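A sketch of varying the weighting at query time, under stated assumptions: the word count is computed by the application and passed in, the two weight values (1.2 and 1.5) are arbitrary, and your_table stands in for the real table name.

```sql
-- Hypothetical rule: use a smaller keyword bias for queries longer than 3 words.
SET @word_count = 1;  -- assumed to be computed by the application for 'watermelon'
SET @kw_weight = IF(@word_count > 3, 1.2, 1.5);

SELECT id, keyword, content,
       MATCH (keyword) AGAINST ('watermelon') AS rel1,
       MATCH (content) AGAINST ('watermelon') AS rel2
FROM your_table
WHERE MATCH (keyword, content) AGAINST ('watermelon')
ORDER BY (rel1 * @kw_weight) + rel2 DESC;
```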
Each index uses disk space, so more indexes mean more disk usage and, in turn, a higher memory footprint for MySQL. Inserts will also take longer, as you have more indexes to update.
You should benchmark performance for your situation, being careful to turn off the MySQL query cache while benchmarking (on versions that still have one; it was removed in MySQL 8.0), or your results will be skewed. This isn't Google-grade efficient, but it is pretty easy, works "out of the box", and is almost certainly a lot better than your use of "like" in the queries.
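For benchmarking, something along these lines may help. It assumes the same placeholder table name your_table; SQL_NO_CACHE only matters on servers that still have a query cache.

```sql
-- Check that the search uses a full-text index (look for type = fulltext).
EXPLAIN
SELECT id
FROM your_table
WHERE MATCH (keyword, content) AGAINST ('watermelon');

-- Time the weighted query without the query cache.
-- SQL_NO_CACHE bypasses the cache on MySQL 5.x; MySQL 8.0 has no query
-- cache, so there the hint has no effect.
SELECT SQL_NO_CACHE id, keyword, content,
       MATCH (keyword) AGAINST ('watermelon') AS rel1,
       MATCH (content) AGAINST ('watermelon') AS rel2
FROM your_table
WHERE MATCH (keyword, content) AGAINST ('watermelon')
ORDER BY (rel1 * 1.5) + rel2 DESC;
```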
I find it works really well.
Actually, using a case statement to make a pair of flags might be a better solution:
select ... ,
    case when keyword like concat('%', @input, '%') then 1 else 0 end as keywordmatch,
    case when content like concat('%', @input, '%') then 1 else 0 end as contentmatch
    -- or whatever check you use for the matching
from ...
    -- and here the rest of your usual matching query
    ...
order by keywordmatch desc, contentmatch desc

(In MySQL, strings are joined with concat(); the + operator performs numeric addition.)
Again, this is only if all keyword matches rank higher than all the content-only matches. I also made the assumption that a match in both keyword and content is the highest rank.