Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MYSQL - Group by limit

Is there a simple way to LIMIT the GROUP BY results to the top 2. The following query returns all the results. Using 'LIMIT 2' reduces the overall list to the top 2 entries only.

select distinct(rating_name), 
       id_markets, 
       sum(rating_good) 'good', 
       sum(rating_neutral)'neutral', 
       sum(rating_bad) 'bad' 
 from ratings 
 where rating_year=year(curdate()) and rating_week= week(curdate(),1)
 group by rating_name,id_markets
 order by rating_name, sum(rating_good) 
 desc

Results in the following :-

poland  78 48 24 12   <- keep
poland   1 15  5  0   <- keep
poland  23 12  6  3
poland   2  5  0  0
poland   3  0  5  0
poland   4  0  0  5
ireland  1  9  3  0   <- keep
ireland  2  3  0  0   <- keep
ireland  3  0  3  0
ireland  4  0  0  3
france  12 24 12  6   <- keep
france   1  3  1  0   <- keep
france 231  1  0  0
france   2  1  0  0
france   4  0  0  1
france   3  0  1  0

Thanks Jon


As requested I have attached a copy of the table structure and some test data. My goal is to create a single view that has the top 2 results from each unique rating_name

CREATE TABLE `zzratings` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `id_markets` int(11) DEFAULT NULL,
  `id_account` int(11) DEFAULT NULL,
  `id_users` int(11) DEFAULT NULL,
  `dateTime` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
  `rating_good` int(11) DEFAULT NULL,
  `rating_neutral` int(11) DEFAULT NULL,
  `rating_bad` int(11) DEFAULT NULL,
  `rating_name` varchar(32) DEFAULT NULL,
  `rating_year` smallint(4) DEFAULT NULL,
  `rating_week` tinyint(4) DEFAULT NULL,
  `cash_balance` decimal(9,6) DEFAULT NULL,
  `cash_spend` decimal(9,6) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `rating_year` (`rating_year`),
  KEY `rating_week` (`rating_week`),
  KEY `rating_name` (`rating_name`)
) ENGINE=MyISAM AUTO_INCREMENT=2166690 DEFAULT CHARSET=latin1;

INSERT INTO `zzratings` (`id`,`id_markets`,`id_account`,`id_users`,`dateTime`,`rating_good`,`rating_neutral`,`rating_bad`,`rating_name`,`rating_year`,`rating_week`,`cash_balance`,`cash_spend`)
VALUES
    (63741, 1, NULL, 100, NULL, 1, NULL, NULL, 'poland', 2010, 15, NULL, NULL),
    (63742, 1, NULL, 101, NULL, 1, NULL, NULL, 'poland', 2010, 15, NULL, NULL),
    (1, 2, NULL, 102, NULL, 1, NULL, NULL, 'poland', 2010, 15, NULL, NULL),
    (63743, 3, NULL, 103, NULL, NULL, 1, NULL, 'poland', 2010, 15, NULL, NULL),
    (63744, 4, NULL, 104, NULL, NULL, NULL, 1, 'poland', 2010, 15, NULL, NULL),
    (63745, 1, NULL, 105, NULL, 1, NULL, NULL, 'poland', 2010, 15, NULL, NULL),
    (63746, 1, NULL, 106, NULL, NULL, 1, NULL, 'poland', 2010, 15, NULL, NULL),
    (63747, 5, NULL, 100, NULL, 1, NULL, NULL, 'ireland', 2010, 15, NULL, NULL),
    (63748, 5, NULL, 101, NULL, 1, NULL, NULL, 'ireland', 2010, 15, NULL, NULL),
    (63749, 2, NULL, 102, NULL, 1, NULL, NULL, 'ireland', 2010, 15, NULL, NULL),
    (63750, 3, NULL, 103, NULL, NULL, 1, NULL, 'ireland', 2010, 15, NULL, NULL),
    (63751, 4, NULL, 104, NULL, NULL, NULL, 1, 'ireland', 2010, 15, NULL, NULL),
    (63752, 1, NULL, 105, NULL, 1, NULL, NULL, 'ireland', 2010, 15, NULL, NULL),
    (63753, 1, NULL, 106, NULL, NULL, 1, NULL, 'ireland', 2010, 15, NULL, NULL),
    (63754, 1, NULL, 100, NULL, 1, NULL, NULL, 'ireland', 2010, 15, NULL, NULL),
    (63755, 1, NULL, 101, NULL, 1, NULL, NULL, 'ireland', 2010, 15, NULL, NULL),
    (63756, 2, NULL, 102, NULL, 1, NULL, NULL, 'ireland', 2010, 15, NULL, NULL),
    (63757, 34, NULL, 103, NULL, NULL, 1, NULL, 'ireland', 2010, 15, NULL, NULL),
    (63758, 34, NULL, 104, NULL, NULL, NULL, 1, 'ireland', 2010, 15, NULL, NULL),
    (63759, 34, NULL, 105, NULL, 1, NULL, NULL, 'ireland', 2010, 15, NULL, NULL),
    (63760, 34, NULL, 106, NULL, NULL, 1, NULL, 'ireland', 2010, 15, NULL, NULL),
    (63761, 21, NULL, 100, NULL, 1, NULL, NULL, 'ireland', 2010, 15, NULL, NULL),
    (63762, 21, NULL, 101, NULL, 1, NULL, NULL, 'ireland', 2010, 15, NULL, NULL),
    (63763, 21, NULL, 102, NULL, 1, NULL, NULL, 'ireland', 2010, 15, NULL, NULL),
    (63764, 21, NULL, 103, NULL, NULL, 1, NULL, 'ireland', 2010, 15, NULL, NULL),
    (63765, 4, NULL, 104, NULL, NULL, NULL, 1, 'ireland', 2010, 15, NULL, NULL),
    (63766, 1, NULL, 105, NULL, 1, NULL, NULL, 'ireland', 2010, 15, NULL, NULL),
    (63767, 1, NULL, 106, NULL, NULL, 1, NULL, 'ireland', 2010, 15, NULL, NULL),
    (63768, 1, NULL, 100, NULL, 1, NULL, NULL, 'france', 2010, 15, NULL, NULL),
    (63769, 1, NULL, 101, NULL, 1, NULL, NULL, 'france', 2010, 15, NULL, NULL),
    (63770, 2, NULL, 102, NULL, 1, NULL, NULL, 'france', 2010, 15, NULL, NULL),
    (63771, 3, NULL, 103, NULL, NULL, 1, NULL, 'france', 2010, 15, NULL, NULL),
    (63772, 4, NULL, 104, NULL, NULL, NULL, 1, 'france', 2010, 15, NULL, NULL);
like image 471
jono2010 Avatar asked Apr 15 '10 07:04

jono2010


People also ask

Can I use limit with GROUP BY in SQL?

No, you can't LIMIT subqueries arbitrarily (you can do it to a limited extent in newer MySQLs, but not for 5 results per group). This is a groupwise-maximum type query, which is not trivial to do in SQL.

Does limit come after GROUP BY?

The ORDER BY clause goes after the FROM clause but before the LIMIT .

Is there a limit clause in MySQL?

The MySQL LIMIT Clause The LIMIT clause is used to specify the number of records to return. The LIMIT clause is useful on large tables with thousands of records. Returning a large number of records can impact performance.

How do I limit data in MySQL?

Limit Data Selections From a MySQL Database Assume we wish to select all records from 1 - 30 (inclusive) from a table called "Orders". The SQL query would then look like this: $sql = "SELECT * FROM Orders LIMIT 30"; When the SQL query above is run, it will return the first 30 records.


2 Answers

I don't think that there is a simple way in MySQL. One way to do this is by generating a row number for each row partitioned in groups by rating_name, and then only select the rows with row_number 2 or less. In most databases you could do this using something like:

SELECT * FROM (
    SELECT
        rating_name,
        etc...,
        ROW_NUMBER() OVER (PARTITION BY rating_name ORDER BY good) AS rn
    FROM your_table
) T1
WHERE rn <= 2

Unfortunately, MySQL doesn't support the ROW_NUMBER syntax. You can however simulate ROW_NUMBER using variables:

SELECT
    rating_name, id_markets, good, neutral, bad
FROM (
    SELECT
        *,
        @rn := CASE WHEN @prev_rating_name = rating_name THEN @rn + 1 ELSE 1 END AS rn,
        @prev_rating_name := rating_name
    FROM (
        SELECT
            rating_name,
            id_markets,
            SUM(COALESCE(rating_good, 0)) AS good,
            SUM(COALESCE(rating_neutral, 0)) AS neutral,
            SUM(COALESCE(rating_bad, 0)) AS bad
        FROM zzratings
        WHERE rating_year = YEAR(CURDATE()) AND rating_week = WEEK(CURDATE(), 1)
        GROUP BY rating_name, id_markets
    ) AS T1, (SELECT @prev_rating_name := '', @rn := 0) AS vars
    ORDER BY rating_name, good DESC
) AS T2
WHERE rn <= 2
ORDER BY rating_name, good DESC

Result when run on your test data:

france    1  2  0  0
france    2  1  0  0
ireland   1  4  2  0
ireland  21  3  1  0
poland    1  3  1  0
poland    2  1  0  0
like image 120
Mark Byers Avatar answered Oct 17 '22 00:10

Mark Byers


This is still possible via a single query, but it's a bit long, and there are some caveats, which I'll explain after the query. Though, they're not flaws in the query so much as some ambiguity in what "top two" means.

Here's the query:

SELECT ratings.* FROM
(SELECT rating_name, 
       id_markets, 
       sum(rating_good) 'good', 
       sum(rating_neutral)'neutral', 
       sum(rating_bad) 'bad' 
 FROM zzratings 
 WHERE rating_year=year(curdate()) AND rating_week = week(curdate(),1)
 GROUP BY rating_name,id_markets) AS ratings
LEFT JOIN
(SELECT rating_name, 
       id_markets, 
       sum(rating_good) 'good', 
       sum(rating_neutral)'neutral', 
       sum(rating_bad) 'bad' 
 FROM zzratings 
 WHERE rating_year=year(curdate()) AND rating_week= week(curdate(),1)
 GROUP BY rating_name,id_markets) AS ratings2
ON ratings2.good <= ratings.good AND
  ratings2.id_markets <> ratings.id_markets AND
  ratings2.rating_name = ratings.rating_name
LEFT JOIN
(SELECT rating_name, 
       id_markets, 
       sum(rating_good) 'good', 
       sum(rating_neutral)'neutral', 
       sum(rating_bad) 'bad' 
 FROM zzratings 
 WHERE rating_year=year(curdate()) AND rating_week= week(curdate(),1)
 GROUP BY rating_name,id_markets) AS ratings3
ON ratings3.good >= ratings2.good AND
  ratings3.id_markets <> ratings.id_markets AND
  ratings3.id_markets <> ratings2.id_markets AND
  ratings3.rating_name = ratings.rating_name
WHERE (ratings2.good IS NULL OR ratings3.good IS NULL) AND
  ratings.good IS NOT NULL
ORDER BY ratings.rating_name, ratings.good DESC

The caveat is that if there is more than one id_market with the same "good" count for the same rating_name, then you will get more than two records. For example, if there are three ireland id_markets with a "good" count of 3, the highest, then how can you display the top two? You can't. So the query will show all three.

Also, if there were one count of "3", the highest, and two counts of "2", you couldn't show the top two, since you have a tie for second place, so the query shows all three.

The query will be simpler if you create a temporary table with the aggregate result set first, then work from that.

CREATE TEMPORARY TABLE temp_table
  SELECT rating_name, 
           id_markets, 
           sum(rating_good) 'good', 
           sum(rating_neutral)'neutral', 
           sum(rating_bad) 'bad' 
     FROM zzratings 
     WHERE rating_year=year(curdate()) AND rating_week= week(curdate(),1;

SELECT ratings.*
 FROM temp_table ratings
LEFT JOIN temp_table ratings2
ON ratings2.good <= ratings.good AND
  ratings2.id_markets <> ratings.id_markets AND
  ratings2.rating_name = ratings.rating_name
LEFT JOIN temp_table ratings3
ON ratings3.good >= ratings2.good AND
  ratings3.id_markets <> ratings.id_markets AND
  ratings3.id_markets <> ratings2.id_markets AND
  ratings3.rating_name = ratings.rating_name
WHERE (ratings2.good IS NULL OR ratings3.good IS NULL) AND
  ratings.good IS NOT NULL
ORDER BY ratings.rating_name, ratings.good DESC;
like image 34
Marcus Adams Avatar answered Oct 16 '22 23:10

Marcus Adams