Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sqlite group_concat ordering

In Sqlite I can use group_concat to do:

1...A 1...B 1...C 2...A 2...B 2...C  1...C,B,A 2...C,B,A 

but the order of the concatenation is random - according to docs.

I need to sort the output of group_concat to be

1...A,B,C 2...A,B,C 

How can I do this?

like image 369
user230781 Avatar asked Dec 13 '09 18:12

user230781


2 Answers

Can you not use a subselect with the order by clause in, and then group concat the values?

Something like

SELECT ID, GROUP_CONCAT(Val) FROM (    SELECT ID, Val    FROM YourTable    ORDER BY ID, Val    ) GROUP BY ID; 
like image 72
Adriaan Stander Avatar answered Oct 11 '22 23:10

Adriaan Stander


To be more precise, according to the docs:

The order of the concatenated elements is arbitrary.

It does not really mean random, it just means that the developers reserve the right to use whatever ordering they whish, even different ones for different queries or in different SQLite versions.

With the current version, this ordering might be the one implied by Adrian Stander's answer, as his code does seem to work. So you might just guard yourself with some unit tests and call it a day. But without examining the source code of SQLite really closely you can never be 100% sure this will always work.

If you are willing to build SQLite from source, you can also try to write your own user-defined aggregate function, but there is an easier way.

Fortunately, since version 3.25.0, you have window functions, providing a guaranteed-to-work, although somewhat ugly solution to your problem.

As you can see in the documentation, window functions have their own ORDER BY clauses:

In the example above, the window frame consists of all rows between the previous row ("1 PRECEDING") and the following row ("1 FOLLOWING"), inclusive, where rows are sorted according to the ORDER BY clause in the window-defn (in this case "ORDER BY a").

Note, that this alone would not necessarily mean that all aggregate functions respect the ordering inside a window frame, but if you take a look at the unit tests, you can see this is actually the case:

do_execsql_test 4.10.1 {   SELECT a,      count() OVER (ORDER BY a DESC),     group_concat(a, '.') OVER (ORDER BY a DESC)    FROM t2 ORDER BY a DESC } {   6 1 6   5 2 6.5   4 3 6.5.4   3 4 6.5.4.3   2 5 6.5.4.3.2   1 6 6.5.4.3.2.1   0 7 6.5.4.3.2.1.0 } 

So, to sum it up, you can write

SELECT ID, GROUP_CONCAT(Val) OVER (PARTITION BY ID ORDER BY Val) FROM YourTable; 

resulting in:

1|A 1|A,B 1|A,B,C 2|A 2|A,B 2|A,B,C 

Which unfortunately also contains every prefix of your desired aggregations. Instead you want to specify the window frames to always contain the full range, then discard the redundant values, like this:

SELECT DISTINCT ID, GROUP_CONCAT(Val) OVER (PARTITION BY ID ORDER BY Val ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) FROM YourTable; 

or like this:

SELECT * FROM (     SELECT ID, GROUP_CONCAT(Val)     OVER (PARTITION BY ID ORDER BY Val ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)     FROM YourTable ) GROUP BY ID; 
like image 25
szmate1618 Avatar answered Oct 11 '22 23:10

szmate1618