I'm making an in webpage cache system. I wanted to make a simple page rank system along with output. The problem is, I want to display the recordset with the highest relevance score per unique domain. One domain may have multiple records but with different titles, descriptions, etc. The problem is, instead of getting 1 recordset containing a unique domain, it groups all the recordsets of that unique domain and outputs them all. I just want the recordset with the highest relevance score per unique domain per group before it outputs the next (and different domain with the highest relevance for that group)
SELECT title, html, sum(relevance) FROM
(
SELECT title, html, 10 AS relevance FROM page WHERE title like ‘%about%’ UNION
SELECT title, html, 7 AS relevance FROM page WHERE html like ‘%about%’ UNION
SELECT title, html, 5 AS relevance FROM page WHERE keywords like ‘%about%’ UNION
SELECT title, html, 2 AS relevance FROM page WHERE description like ‘%about%’
) results
GROUP BY title, html
ORDER BY relevance desc;
I'm getting:
domain1 title html
domain1 title html
domain1 title html
domain2 title html
domain2 title html
domain2 title html
What I want is
domain1 title html
domain2 title html
domain3 title html
domain4 title html
domain5 title html
Using GROUP BY with LIMITIf you don't group by any columns, you'll get a 1-row result—no problem there. If you group by a column with enough unique values that it exceeds the LIMIT number, the aggregates will be calculated, and then some rows will simply be omitted from the results.
The GROUP BY Clause SQL is used to group rows with same values. The GROUP BY Clause is used together with the SQL SELECT statement. The SELECT statement used in the GROUP BY clause can only be used contain column names, aggregate functions, constants and expressions.
LIMIT Clause with ORDER BY Clause So, it is a good habit to use Group By clause with a Limit clause for getting the rows in a specific order. The following syntax can be used to get the result in a unique order: SELECT column_list. FROM table_name.
MAX(x) - 1 simply means the max value of x in the table minus one. You can always use parenthesis and aliases ( as some_cool_name ) to make thing clearer, or to change names in the result. But the first syntax is perfectly valid.
I'm not sure why your code even works, since I think you should have
ORDER BY Sum(relevance) DESC
instead of
ORDER BY relevance DESC
Maybe that's the problem?
Beyond that, what about this. It is ugly, but it will work. It would be better if SQL Server understood how to refer to aliases later in the query. But alas.
SELECT title, html, Case When title LIKE '%about%' Then 10 Else 0 End + Case When html LIKE '%about%' Then 7 Else 0 End + Case When keywords LIKE '%about%' Then 5 Else 0 End + Case When description LIKE '%about%' Then 2 Else 0 End AS relevance FROM page WHERE Case When title LIKE '%about%' Then 10 Else 0 End + Case When html LIKE '%about%' Then 7 Else 0 End + Case When keywords LIKE '%about%' Then 5 Else 0 End + Case When description LIKE '%about%' Then 2 Else 0 End > 0 ORDER BY Case When title LIKE '%about%' Then 10 Else 0 End + Case When html LIKE '%about%' Then 7 Else 0 End + Case When keywords LIKE '%about%' Then 5 Else 0 End + Case When description LIKE '%about%' Then 2 Else 0 End DESC;
Or maybe just a slight rearrangement:
SELECT title, html, relevance FROM (SELECT title, html, Case When title LIKE '%about%' Then 10 Else 0 End + Case When html LIKE '%about%' Then 7 Else 0 End + Case When keywords LIKE '%about%' Then 5 Else 0 End + Case When description LIKE '%about%' Then 2 Else 0 End AS relevance FROM page) WHERE relevance > 0 ORDER BY relevance DESC;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With