Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQLITE: merging rows into single row if they share a column

From a previous post, I have the following view in sqlite3:

CREATE View AttendeeTableView AS

SELECT  (LastName || " " || FirstName) as AttendeeName,  
        CompanyName, 
        PhotoURI,
        CompanyAttendeeRelation.CompanyId,
        CompanyAttendeeRelation.AttendeeId 

FROM    Attendee 
JOIN    CompanyAttendeeRelation on CompanyAttendeeRelation.AttendeeId = Attendee.AttendeeId 

ORDER BY LastName;

Now, since the data is generated from a many-to-many relation between Attendee and Company, I can get results such as:

Doe John | company A | johnPic.png | 1 | 15
Doe John | company B | johnPic.png | 2 | 15

What I'd like to do is, in cases where there's more than one company (like above), create a query that outputs:

Doe John | company A company B | johnPic.png | 1 2 | 15

And another that outputs:

Doe John | company A | company B | johnPic.png | 1 | 2 | 15

So I need to know essentially how to merge a specific column for rows that have different values in that table.

Any ideas?

Just in case, company A company B in the first query is obviously text concatenation, That is, something along the lines of (row1.CompanyName || " " || row2.CompanyName)

like image 326
SaldaVonSchwartz Avatar asked Oct 23 '11 16:10

SaldaVonSchwartz


1 Answers

Use the aggregate function group_concat(X) for that:

SELECT (a.LastName || " " || a.FirstName) AS AttendeeName
     , a.PhotoURI
     , group_concat(c.CompanyName) AS Companies
     , group_concat(c.CompanyId)   AS CompanyIds
FROM   Attendee AS a
JOIN   CompanyAttendeeRelation AS ca ON ca.AttendeeId = a.AttendeeId
JOIN   Company                 AS c  ON c.CompanyId = ca.CompanyId
GROUP  BY a.LastName, a.Firstname, a.PhotoURI;

(Using table aliases to make it shorter and easier to read.)

NULL values are excluded from the result. The manual:

the concatenation of all non-NULL values

The order of elements in CompanyIds and Companies is arbitrary, according to the manual:

The order of the concatenated elements is arbitrary.

Also note that "arbitrary" is not the same as "random". group_concat, like other aggregate functions, processes the set of rows in the order received. Without any ORDER BY, that order is dictated by whatever query plan is executed. There is no natural order in tables of relational databases (you cannot rely on insert order at all). But both instances of group_concat() in the same SELECT list process rows in the same order so that the 1st ID in CompanyIds corresponds to the 1st name in Companies.

You can impose your order with ORDER BY in a subquery. It's an implementation detail, but it's highly unlikely to change. Like:

SELECT (LastName || " " || FirstName) AS AttendeeName
     , PhotoURI
     , group_concat(CompanyName) AS Companies
     , group_concat(CompanyId)   AS CompanyIds
FROM  (
   SELECT a.LastName, a.FirstName, a.PhotoURI, c.CompanyName, c.CompanyId
   FROM   Attendee AS a
   JOIN   CompanyAttendeeRelation AS ca ON ca.AttendeeId = a.AttendeeId
   JOIN   Company                 AS c  ON c.CompanyId = ca.CompanyId
   ORDER  BY 1,2,3,4,5  -- or whatever you need
   ) AS sub
GROUP  BY LastName, Firstname, PhotoURI;

The manual about the (optional) ordinal numbers in ORDER BY:

If the ORDER BY expression is a constant integer K then the expression is considered an alias for the K-th column of the result set (columns are numbered from left to right starting with 1).

Use the GROUP BY list as leading ORDER BY expressions for best results.

Don't do anything with the derived table after ordering that might rearrange it (like joining the subquery to another table etc.)

Finally, note that similar aggregate functions in other RDBMS can behave slightly differently. Related:

  • Concatenate multiple result rows of one column into one, group by another column
  • GROUP_CONCAT ORDER BY
like image 55
Erwin Brandstetter Avatar answered Nov 16 '22 23:11

Erwin Brandstetter