Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using group by on multiple columns

I understand the point of GROUP BY x.

But how does GROUP BY x, y work, and what does it mean?

like image 437
Alex Gordon Avatar asked Mar 10 '10 23:03

Alex Gordon


People also ask

Can GROUP BY be used with multiple columns?

The SQL GROUP BY clause is used along with some aggregate functions to group columns that have the same values in different rows. The group by multiple columns technique is used to retrieve grouped column values from one or more tables of the database by considering more than one column as grouping criteria.

Can we GROUP BY 2 columns in SQL?

SELECT Statement: The GROUP BY Clause in SQLA GROUP BY clause can contain two or more columns—or, in other words, a grouping can consist of two or more columns.

How do you group columns in SQL?

The SQL GROUP BY clause allows us to group individual data based on defined criteria. You can group individual data by one or more table columns. In order to do the grouping properly, you often need to apply aggregate functions to the column(s) within the SQL SELECT statement.

Does GROUP BY need all columns?

If you specify the GROUP BY clause, columns referenced must be all the columns in the SELECT clause that do not contain an aggregate function. These columns can either be the column, an expression, or the ordinal number in the column list.


1 Answers

Group By X means put all those with the same value for X in the one group.

Group By X, Y means put all those with the same values for both X and Y in the one group.

To illustrate using an example, let's say we have the following table, to do with who is attending what subject at a university:

Table: Subject_Selection  +---------+----------+----------+ | Subject | Semester | Attendee | +---------+----------+----------+ | ITB001  |        1 | John     | | ITB001  |        1 | Bob      | | ITB001  |        1 | Mickey   | | ITB001  |        2 | Jenny    | | ITB001  |        2 | James    | | MKB114  |        1 | John     | | MKB114  |        1 | Erica    | +---------+----------+----------+ 

When you use a group by on the subject column only; say:

select Subject, Count(*) from Subject_Selection group by Subject 

You will get something like:

+---------+-------+ | Subject | Count | +---------+-------+ | ITB001  |     5 | | MKB114  |     2 | +---------+-------+ 

...because there are 5 entries for ITB001, and 2 for MKB114

If we were to group by two columns:

select Subject, Semester, Count(*) from Subject_Selection group by Subject, Semester 

we would get this:

+---------+----------+-------+ | Subject | Semester | Count | +---------+----------+-------+ | ITB001  |        1 |     3 | | ITB001  |        2 |     2 | | MKB114  |        1 |     2 | +---------+----------+-------+ 

This is because, when we group by two columns, it is saying "Group them so that all of those with the same Subject and Semester are in the same group, and then calculate all the aggregate functions (Count, Sum, Average, etc.) for each of those groups". In this example, this is demonstrated by the fact that, when we count them, there are three people doing ITB001 in semester 1, and two doing it in semester 2. Both of the people doing MKB114 are in semester 1, so there is no row for semester 2 (no data fits into the group "MKB114, Semester 2")

Hopefully that makes sense.

like image 145
Smashery Avatar answered Oct 18 '22 10:10

Smashery