Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

mysql SORT BY amount of unique word matches

I've found many questions that ask for amount of appearences, but none that ask the very same as I wish to do.

A dynamically generated (prepared-statement) query will result in something like this:

SELECT * FROM products WHERE 
( title LIKE ? AND title LIKE ? ) AND 
( content LIKE ? OR content LIKE ? ) AND 
( subtitle LIKE ? AND author LIKE ? )
ORDER BY relevance LIMIT ?,?

The amount of words entered, (and so the amount of LIKE) are for title,content and author a variable amount (depending on the search query).

Now I've added a ORDER BY relevance. But I wish this order to be the amount of unique words from the content-field that match. (Note: Not on the amount of appearences, but on the amount of entered strings in the content column that have at least one match).

Example table products:

id | title   | subtitle  | content
------------------------------------
1  | animals | cat       | swim swim swim swim swim swim swim
2  | canimal | fish      | some content
3  | food    | roasted   | some content
4  | animal  | cat       | swim better better swims better something else
5  | animal  | cat       | dogs swim better

Example query (with prepared statements ? filled in):

SELECT * FROM products WHERE 
( title LIKE %animal% ) AND 
( content LIKE %dog% OR content LIKE %swim% OR content LIKE %better% ) AND 
( subtitle LIKE %cat% )
ORDER BY relevance LIMIT 0,10

Expected results (in correct order!):

id      | amount of matches
-----------------
5       | 3 (dog, swim, better)
4       | 2 (swim, better)
1       | 1 (swim)

I have an Innodb table and mysql version lower than 5.6, therefore I can't use MATCH...AGAINST. I was thinking this could be solved with WHEN CASE ... THEN. But I have no idea how I could create this sorting.

like image 243
Pepe Avatar asked Oct 23 '13 11:10

Pepe


2 Answers

You can do it in many ways for example

ORDER BY SIGN(LOCATE('dog',content))+
         SIGN(LOCATE('swim',content))+
         SIGN(LOCATE('better',content)) DESC

SQLFiddle demo

or with CASE

ORDER BY 
CASE WHEN content LIKE '%dog%' 
        THEN 1
        ELSE 0
END
+
CASE WHEN content LIKE '%swim%' 
        THEN 1
        ELSE 0
END
+
CASE WHEN content LIKE '%better%' 
        THEN 1
        ELSE 0
END

DESC
like image 114
valex Avatar answered Oct 14 '22 10:10

valex


Check like this.

    SELECT id,CONCAT_WS('-',COUNT(LENGTH(content) - LENGTH(REPLACE(content, ' ', '')) + 1),REPLACE(content,' ',',')) AS amount of matches FROM products 
        WHERE 
        ( title LIKE %animal% ) AND 
        ( content LIKE %dog% OR content LIKE %swim% OR content LIKE %better% ) AND 
        ( subtitle LIKE %cat% )
        GROUP BY id
ORDER BY id
like image 27
Sanal K Avatar answered Oct 14 '22 09:10

Sanal K