Which is more efficient of the 2, specifically in the case when using Listagg with varchar type data?
SELECT department_id AS "Dept.",
LISTAGG(last_name, '; ') WITHIN GROUP (ORDER BY hire_date) AS "Employees"
FROM employees
GROUP BY department_id
ORDER BY department_id;
OR
SELECT department_id AS "Dept.",
LISTAGG(last_name, '; ') WITHIN GROUP (ORDER BY hire_date)
OVER (PARTITION BY department_id) AS "Employees"
FROM employees
ORDER BY department_id;
For me, unless I'm only selecting 1/2 columns, I'll always use partition as I don't have to include each column in the 'GROUP BY' clause.
The two queries give different outputs. Using GROUP BY will return one row-per-group whilst using OVER ( PARTITION BY .. ) will return all the rows and duplicate the LISTAGG result for each row in the partition.
Use whichever solution is more appropriate for your desired output - but they are not equivalent.
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE employees ( department_id, last_name, hire_date ) AS
SELECT 1, 'AAA', DATE '2018-01-01' FROM DUAL UNION ALL
SELECT 1, 'BBB', DATE '2018-01-02' FROM DUAL UNION ALL
SELECT 1, 'CCC', DATE '2018-01-03' FROM DUAL UNION ALL
SELECT 2, 'DDD', DATE '2018-01-01' FROM DUAL UNION ALL
SELECT 2, 'EEE', DATE '2018-01-02' FROM DUAL;
Query 1:
SELECT department_id AS "Dept.",
LISTAGG(last_name, '; ') WITHIN GROUP (ORDER BY hire_date) AS "Employees"
FROM employees
GROUP BY department_id
ORDER BY department_id
Results:
| Dept. | Employees |
|-------|---------------|
| 1 | AAA; BBB; CCC |
| 2 | DDD; EEE |
Query 2:
SELECT department_id AS "Dept.",
LISTAGG(last_name, '; ') WITHIN GROUP (ORDER BY hire_date)
OVER (PARTITION BY department_id) AS "Employees"
FROM employees
ORDER BY department_id
Results:
| Dept. | Employees |
|-------|---------------|
| 1 | AAA; BBB; CCC |
| 1 | AAA; BBB; CCC |
| 1 | AAA; BBB; CCC |
| 2 | DDD; EEE |
| 2 | DDD; EEE |
Both queries do the same amount of work, as each one has to perform a full scan of EMPLOYEES sorting the values of LAST_NAME within DEPARTMENT_ID.
Using the Oracle 'HR' demo schema (where EMPLOYEES has 107 rows) and the SQL*Plus set autotrace tool, we can see both perform 7 consistent gets and 1 sort.
SQL> set autotrace traceonly explain statistics
SQL> select department_id as "Dept."
2 , listagg(last_name, '; ') within group(order by hire_date) as "Employees"
3 from employees
4 group by department_id
5 order by department_id;
12 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2107619104
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 11 | 209 | 4 (25)| 00:00:01 |
| 1 | SORT GROUP BY | | 11 | 209 | 4 (25)| 00:00:01 |
| 2 | TABLE ACCESS FULL| EMPLOYEES | 107 | 2033 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
1605 bytes sent via SQL*Net to client
608 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
12 rows processed
SQL> select department_id as "Dept."
2 , listagg(last_name, '; ') within group(order by hire_date) over(partition by department_id) as "Employees"
3 from employees
4 order by department_id;
107 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1919783947
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 107 | 2033 | 3 (0)| 00:00:01 |
| 1 | WINDOW SORT | | 107 | 2033 | 3 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| EMPLOYEES | 107 | 2033 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
3703 bytes sent via SQL*Net to client
685 bytes received via SQL*Net from client
9 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
107 rows processed
There is more network activity for the second query as it returns more rows.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With