Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Select rows with Max Value grouped by two columns

I have seen quite a few solutions on this kind of problem (esp. this one SQL Select only rows with Max Value on a Column), but none of these seem to be appropriate:

I have the following table layout, a versioning of attachments, which are bound to entities:

TABLE attachments
+------+--------------+----------+----------------+---------------+
| id   | entitiy_id   | group_id | version_number | filename      |
+------+--------------+----------+----------------+---------------+
| 1    | 1            | 1        | 1              | file1-1.pdf   |
| 2    | 1            | 1        | 2              | file1-2.pdf   |
| 3    | 1            | 2        | 1              | file2-1.pdf   |
| 4    | 2            | 1        | 1              | file1-1.pdf   |
| 5    | 2            | 1        | 2              | file1-2.pdf   |
| 6    | 2            | 3        | 1              | file3-1.pdf   |
+------+--------------+----------+----------------+---------------+

Output should be Max version number, grouped by group_id and entity_id, I'd only need a list for single entity_ids if that helps:

+------+--------------+----------+----------------+---------------+
| id   | entitiy_id   | group_id | version_number | filename      |
+------+--------------+----------+----------------+---------------+
| 2    | 1            | 1        | 2              | file1-2.pdf   |
| 3    | 1            | 2        | 1              | file2-1.pdf   |
| 5    | 2            | 1        | 2              | file1-2.pdf   |
| 6    | 2            | 3        | 1              | file3-1.pdf   |
+------+--------------+----------+----------------+---------------+

What I have come up with is this self join one:

SELECT *
FROM   `attachments` `attachments`
       LEFT OUTER JOIN attachments t2
         ON ( attachments.group_id = t2.group_id
              AND attachments.version_number < t2.version_number )
WHERE  ( t2.group_id IS NULL )
   AND ( `t2`.`id` = 1 )
GROUP  BY t2.group_id

But this one only works if different entities do not share same group numbers. This, unfortunately is necessary.

I came across a working solution while creating a view, but this is not supported in my current setup.

Any ideas are highly appreciated. Thanks!

like image 492
Simon Brückner Avatar asked Feb 20 '12 11:02

Simon Brückner


3 Answers

Try this:

select t1.* from attachments t1
left join attachments t2
on t1.entity_id = t2.entity_id and t1.group_id = t2.group_id and
   t1.version_number < t2.version_number
where t2.version_number is null
like image 94
Mosty Mostacho Avatar answered Oct 18 '22 17:10

Mosty Mostacho


This would work for selecting all

SELECT attachments.*
FROM (
    SELECT entitiy_id, group_id, MAX(version_number) AS max_version
    FROM attachments
    GROUP BY entitiy_id, group_id
) AS maxVersions
INNER JOIN attachments
ON attachments.entitiy_id = maxVersions.entitiy_id
AND attachments.group_id = maxVersions.group_id
AND attachments.version_number = maxVersions.max_version

Expanding this to just look for a single entitiy_id would simply involve adding a WHERE into the subquery, so this would give

SELECT attachments.*
FROM (
    SELECT entitiy_id, group_id, MAX(version_number) AS max_version
    FROM attachments
    WHERE entitity_id = [[YOUR ENTITIY ID HERE]]
    GROUP BY entitiy_id, group_id
) AS maxVersions
INNER JOIN attachments
ON attachments.entitiy_id = maxVersions.entitiy_id
AND attachments.group_id = maxVersions.group_id
AND attachments.version_number = maxVersions.max_version

If you wish to make sure that this continues to operate fast as the number of rows increases, I would advise you make sure you add a key onto attachments with the rows (entitiy_id, group_id, max_version) as then the subquery would be able to rely on that thus ensuring it doesn't lock up the table.

like image 2
Simon at My School Portal Avatar answered Oct 18 '22 16:10

Simon at My School Portal


This would do the trick:

select a1.* from attachments a1
inner join ( select entitiy_id, group_id, max(version_number) as version_number
             from attachments
             group by entitiy_id, group_id) a2 on a1.entitiy_id = a2.entitiy_id and
                                                  a1.group_id = a2.group_id and
                                                  a1.version_number = a2.version_number
like image 2
aF. Avatar answered Oct 18 '22 17:10

aF.