Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL sort by number of occurrences

I am doing a search in two text fields called Subject and Text for a specific keyword. To do this I use the LIKE statement. I have encountered a problem when trying to sort the results by the number of occurrences.

my search query looks like this:

SELECT * FROM Table WHERE (Text LIKE '%Keyword%' OR Subject LIKE '%Keyword%')

I tried to add a count() statement and sort it by the number of occurrences, but the count() statement just keep returning the number of rows in my table.

Here is the query with count statement:

SELECT *, COUNT(Text LIKE '%Keyword%') AS cnt FROM News WHERE (Text LIKE '%Keyword%' OR Subject LIKE '%Keyword%') ORDER BY cnt

What im looking for is something that returns the number of matches on the Subject and Text columns on each row, and then order the result after the highest amount of occurrences of the keyword on each row.

like image 937
Marcus Hanikat Avatar asked Mar 02 '14 20:03

Marcus Hanikat


2 Answers

Below query can give you the no.of occurrences of string appears in both columns i.e text and subject and will sort results by the criteria but this will not be a good solution performance wise its better to sort the results in your application code level

SELECT *,
(LENGTH(`Text`) - LENGTH(REPLACE(`Text`, 'Keyword', ''))) / LENGTH('Keyword')
+
(LENGTH(`Subject`) - LENGTH(REPLACE(`Subject`, 'Keyword', ''))) / LENGTH('Keyword') `occurences`
 FROM 
`Table`
 WHERE (Text LIKE '%Keyword%' OR Subject LIKE '%Keyword%')
ORDER BY `occurences`  DESC

Fiddle Demo

Suggested by @lserni a more cleaner way of calculation of occurrences

SELECT *,
(LENGTH(`Text`) - LENGTH(REPLACE(`Text`, 'test', ''))) / LENGTH('test') `appears_in_text`,

(LENGTH(`Subject`) - LENGTH(REPLACE(`Subject`, 'test', ''))) / LENGTH('test') `appears_in_subject`,

(LENGTH(CONCAT(`Text`,' ',`Subject`)) - LENGTH(REPLACE(CONCAT(`Text`,' ',`Subject`), 'test', ''))) / LENGTH('test') `occurences`
 FROM 
`Table1`
 WHERE (TEXT LIKE '%test%' OR SUBJECT LIKE '%test%')
ORDER BY `occurences`  DESC

Fiddle Demo 2

like image 150
M Khalid Junaid Avatar answered Oct 14 '22 07:10

M Khalid Junaid


You want SUM instead. Count will count how many records have non-null values, which means ALL matches and NON-matches will be counted.

SELECT *, SUM(Text LIKE '%Keyword') AS total_matches
...
ORDER BY total_matches

SUM() will count up how many boolean true results the LIKE produces, which will be typecast to integers, so you get a result like 1+1+1+0+1 = 4, instead of the 5 non-nulls count.

like image 31
Marc B Avatar answered Oct 14 '22 07:10

Marc B