 

GROUP BY only primary key, but select other values

Is there a way to group by a unique (primary) key, essentially giving an implicit guarantee that the other columns from that table will be well-defined?

SELECT myPrimaryKey, otherThing
FROM myTable
GROUP BY myPrimaryKey

I know that I can add the other columns to the statement (GROUP BY myPrimaryKey,otherThing), but I'm trying to avoid that. If you're curious why, read on:


I have a statement which is essentially doing this:

SELECT nodes.node_id, nodes.node_label, COUNT(1)
FROM {a couple of joined tables}
INNER JOIN nodes USING (node_id)
GROUP BY nodes.node_id, nodes.node_label

which works fine, but is a bit slow in MySQL. If I remove nodes.node_label from the GROUP BY, it runs about 10x faster (according to EXPLAIN, this is because one of the earlier joins starts using an index when previously it didn't).

We're in the process of migrating to Postgres, so all new statements are supposed to be compatible with both MySQL and Postgres when possible. Now in Postgres, the original statement runs fast, but the new statement (with the reduced group by) won't run (because Postgres is stricter). In this case, it's a false error because the statement is actually well-defined.

Is there a syntax I can use which will let the same statement run in both platforms, while letting MySQL use just one column in the group by for speed?

Dave asked Jun 05 '14 14:06





2 Answers

In recent versions of MySQL, sql_mode may include ONLY_FULL_GROUP_BY, which rejects queries that select nonaggregated columns that are neither named in the GROUP BY clause nor functionally dependent on the GROUP BY columns. In practice this forces you to wrap such columns in a function like MAX(), AVG() or GROUP_CONCAT(), even when you just want any value.

This flag is enabled by default in MySQL 5.7 (as of 5.7.5).

The ANY_VALUE() function, added in the same release, exists for that case.

You can achieve the same effect without disabling ONLY_FULL_GROUP_BY by using ANY_VALUE() to refer to the nonaggregated column.

select t.index, any_value(t.insert_date)
from my_table t
group by t.index;

More information here: https://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sqlmode_only_full_group_by and here: https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html
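Applied to the query from the question, a minimal sketch might look like the following. The table name edges is a hypothetical stand-in for the elided joined tables. ANY_VALUE() is shown as the MySQL form; nothing is assumed here about whether the target Postgres version provides it, so a MIN() variant that both engines accept is included. Because node_label is constant within each node_id group, MIN() simply returns that one value. Whether MySQL still chooses the faster index with either form is not guaranteed and would need checking with EXPLAIN.

```sql
-- MySQL 5.7.5+: ANY_VALUE() says "any row's value will do" for node_label
SELECT nodes.node_id, ANY_VALUE(nodes.node_label) AS node_label, COUNT(1)
FROM edges                        -- hypothetical stand-in for the joined tables
INNER JOIN nodes USING (node_id)
GROUP BY nodes.node_id;

-- Variant accepted by both MySQL and Postgres: an ordinary aggregate
-- over a column that has a single value per group
SELECT nodes.node_id, MIN(nodes.node_label) AS node_label, COUNT(1)
FROM edges                        -- hypothetical stand-in for the joined tables
INNER JOIN nodes USING (node_id)
GROUP BY nodes.node_id;
```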

santiago arizti answered Oct 08 '22 17:10



In Postgres (not in MySQL, though), you could use DISTINCT ON to pick a single, consistent row per value (or group of values) without aggregating them:

SELECT DISTINCT ON (n.node_id)
       *                 -- select any or all columns of all joined tables
FROM   {a couple of joined tables}
JOIN   nodes n USING (node_id)

That gives you a single, arbitrary row for each node_id. To pick a specific row, add:

ORDER  BY n.node_id, ... -- what to sort first?

Add more ORDER BY items after n.node_id to pick a specific row. Details are in the related question "Select first row in each GROUP BY group?".
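Put together, a complete statement might look like this sketch. The table edges and its column created_at are hypothetical, standing in for the elided joined tables and for whatever column decides which row wins. The leading ORDER BY expression has to match the DISTINCT ON expression. Note that this returns one row per node_id rather than a COUNT, so it replaces the GROUP BY only when a representative row is what you want.

```sql
-- Postgres only: keep the most recent edges row for each node
SELECT DISTINCT ON (n.node_id)
       n.node_id, n.node_label, e.created_at
FROM   edges e                          -- hypothetical joined table
JOIN   nodes n USING (node_id)
ORDER  BY n.node_id, e.created_at DESC; -- first row per node_id is kept
```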

Erwin Brandstetter answered Oct 08 '22 17:10
