Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Limit Per Criteria

I have an articles table and a categories table. I want to fetch 7 articles for each category. Currently I have this but it's terrible slow on large tables so it's not really a solution:

SELECT id, 
       title, 
       categories_id, 
       body, 
       DATE_FORMAT(pubdate, "%d/%m/%y %H:%i") as pubdate   
FROM articles AS t 
WHERE ( 
    SELECT COUNT(*) 
    FROM articles 
    WHERE t.categories_id = categories_id 
      AND id< t.id AND publish = 1 
      AND expires > '2008-12-14 18:38:02' 
      AND pubdate <= '2008-12-14 18:38:02' 
    ) < 7 
ORDER BY categories_id DESC

Using explain, it shows me it's doing a join type ALL & REF. The select types are PRIMARY and DEPENDENT SUBQUERY .

Is there a better solution?

like image 599
scc Avatar asked Nov 28 '25 22:11

scc


2 Answers

Here's how I'd solve this problem:

SELECT a1.id, 
       a1.title, 
       a1.categories_id, 
       a1.body, 
       DATE_FORMAT(a1.pubdate, "%d/%m/%y %H:%i") as pubdate  
FROM articles AS a1
  LEFT OUTER JOIN articles AS a2
  ON (a1.categories_id = a2.categories_id AND 
     (a1.pubdate < a2.pubdate OR (a1.pubdate = a2.pubdate AND a1.id < a2.id)))
GROUP BY a1.id
HAVING COUNT(*) < 7;

A correlated subquery usually performs poorly, so this technique uses a join instead.

For a given article, search for articles (a2) that match the category of the current article under consideration (a1), and have a more recent date (or higher id in the case of a tie). If there are fewer than seven articles that meet that criteria, then the current one must be among the most recent of its category.

If you can rely on the unique id column having the same sort order as pubdate, then you can simplify the join because there will be no ties over a unique column:

  ON (a1.categories_id = a2.categories_id AND a1.id < a2.id)
like image 146
Bill Karwin Avatar answered Nov 30 '25 12:11

Bill Karwin


  1. How big are the tables, and how slow is slow?

  2. What indexes are there on the tables?

  3. What is the entire information from EXPLAIN?

Also, the two datetime values are explicit, so it looks like this is either being generated from code generated by something composing it from other information. Is there another SQL query of some kind, which is executing this within a loop on a list?

It's not clear which 7 articles are being chosen - the most recent? By which date?

like image 43
dkretz Avatar answered Nov 30 '25 12:11

dkretz



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!