Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySql second-smallest element in each group

I have a table similar to the following:

    date    |   expiry
-------------------------    
2010-01-01  | 2010-02-01
2010-01-01  | 2010-03-02
2010-01-01  | 2010-04-04
2010-02-01  | 2010-03-01
2010-02-01  | 2010-04-02

In the table, each date may have multiple 'expiry' values. I need a query that returns the n-th smallest expiry in each date. For example, for n = 2, I would expect:

     date    |   expiry
-------------------------       
2010-01-01  | 2010-03-02
2010-02-01  | 2010-04-02

My trouble is that AFAIK, there is no aggregate function which returns the n'th largest/smallest element, so I can't use 'GROUP BY'. More specifically, if I had a magical MIN() aggregate that accepts a second parameter 'offset', I would write:

SELECT MIN(expiry, 1) FROM table WHERE date IN ('2010-01-01', '2010-02-01') GROUP BY date

Any suggestions?

like image 965
bavaza Avatar asked Jan 26 '11 20:01

bavaza


1 Answers

One hack is to use group_concat. Group by the date and concat the expiry date in ascending order and use substring_index function to fetch the nth value.

mysql> select * from expiry;
+------------+------------+
| date       | expiry     |
+------------+------------+
| 2010-01-01 | 2010-02-01 |
| 2010-01-01 | 2010-03-02 |
| 2010-01-01 | 2010-04-04 |
| 2010-02-01 | 2010-03-01 |
| 2010-02-01 | 2010-04-02 |
+------------+------------+
5 rows in set (0.00 sec)

mysql> SELECT mdate,
       Substring_index(Substring_index(edate, ',', 2), ',', -1) AS exp_date
FROM   (SELECT `date`               AS mdate,
               GROUP_CONCAT(expiry order by expiry asc separator ",") AS edate
        FROM   expiry
        GROUP  BY mdate) e1;  
+------------+------------+
| mdate      | exp_date   |
+------------+------------+
| 2010-01-01 | 2010-03-02 |
| 2010-02-01 | 2010-04-02 |
+------------+------------+
2 rows in set (0.00 sec)

In the example here the sub-query gives the following output:

+------------+----------------------------------+
| mdate      | edate                            |
+------------+----------------------------------+
| 2010-01-01 | 2010-02-01,2010-03-02,2010-04-04 |
| 2010-02-01 | 2010-03-01,2010-04-02            |
+------------+----------------------------------+

substring_index(edate,',',2) goes 2 elements forward (for nth element substitute 2 by n).

+------------+------------------------------+
| mdate      | substring_index(edate,',',2) |
+------------+------------------------------+
| 2010-01-01 | 2010-02-01,2010-03-02        |
| 2010-02-01 | 2010-03-01,2010-04-02        |
+------------+------------------------------+

we run another substring_index on the above output to get only the 2nd element (the last element of the intermediate result) using substring_index(substring_index(edate,',',2),',',-1)

+------------+------------------------------------------------------+
| mdate      | substring_index(substring_index(edate,',',2),',',-1) |
+------------+------------------------------------------------------+
| 2010-01-01 | 2010-03-02                                           |
| 2010-02-01 | 2010-04-02                                           |
+------------+------------------------------------------------------+

If there are too many values to concat you might run out of group_concat_max_len value (default 1024, but can be set higher).

UPDATE: The SQL given above will give nth element even when there is less n elements for tht group. To avoid that the sql can be modified as:

SELECT mdate,
       IF(cnt >= 2,Substring_index(Substring_index(edate, ',', 2), ',', -1),NULL) AS exp_date
FROM   (SELECT `date`               AS mdate,
               count(expiry) as cnt,
               GROUP_CONCAT(expiry order by expiry asc separator ",") AS edate
        FROM   expiry
        GROUP  BY mdate) e1;  
like image 180
Damodharan R Avatar answered Sep 28 '22 02:09

Damodharan R