I am trying to use the LISTAGG function in Oracle. I would like to get only the distinct values for that column. Is there a way to get only the distinct values without creating a function or a procedure?
col1  col2  Created_by
1     2     Smith
1     2     John
1     3     Ajay
1     4     Ram
1     5     Jack
I need to select col1 and the LISTAGG of col2 (column 3 is not considered). When I do that, I get something like this as the result of LISTAGG: [2,2,3,4,5]
I need to remove the duplicate '2' here; I need only the distinct values of col2 against col1.
Description: The LISTAGG aggregate function now supports duplicate elimination by using the new DISTINCT keyword. The LISTAGG aggregate function orders the rows for each group in a query according to the ORDER BY expression and then concatenates the values into a single string.
With the DISTINCT option, the processing to remove duplicate values can be done directly within the LISTAGG function. The result is simpler, faster, more efficient SQL.
The Oracle LISTAGG function is an aggregate function that returns a single row. It is used to transform data from multiple rows into a single list of values separated by a given delimiter. It operates on all rows in a group and returns a single value: a comma-separated (or otherwise delimited) string, much like a line of a CSV file exported from Excel.
The listagg function transforms values from a group of rows into a list of values that are delimited by a configurable separator. Listagg is typically used to denormalize rows into a string of comma-separated values (CSV) or other comparable formats suitable for human reading.
19c and later:
select listagg(distinct the_column, ',') within group (order by the_column) from the_table
18c and earlier:
select listagg(the_column, ',') within group (order by the_column) from ( select distinct the_column from the_table ) t
If you need more columns, something like this might be what you are looking for:
select col1, listagg(col2, ',') within group (order by col2)
from (
  select col1, col2,
         row_number() over (partition by col1, col2 order by col1) as rn
  from foo
  order by col1, col2
)
where rn = 1
group by col1;
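As a self-contained check against the data in the question, here is a minimal sketch that de-duplicates with a plain DISTINCT subquery instead of row_number(). It assumes Oracle 11gR2 or later; the name sample_rows and the alias col2_list are placeholders, and the inline rows simply mirror the question's table.

```sql
-- Inline copy of the question's sample rows (sample_rows is a placeholder name).
with sample_rows (col1, col2, created_by) as (
  select 1, 2, 'Smith' from dual union all
  select 1, 2, 'John'  from dual union all
  select 1, 3, 'Ajay'  from dual union all
  select 1, 4, 'Ram'   from dual union all
  select 1, 5, 'Jack'  from dual
)
-- De-duplicate (col1, col2) pairs first, then aggregate per col1.
select col1,
       listagg(col2, ',') within group (order by col2) as col2_list
from (select distinct col1, col2 from sample_rows)
group by col1;
-- returns one row: col1 = 1, col2_list = 2,3,4,5
```

Because created_by is left out of the subquery, the two rows with col2 = 2 collapse into one before LISTAGG sees them.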
From Oracle 19c, it is built in (see here).
For 18c and earlier, try WITHIN GROUP (see here).
Otherwise, use regular expressions.
Here's how to solve your issue.
select regexp_replace( '2,2,2.1,3,3,3,3,4,4' ,'([^,]+)(,\1)*(,|$)', '\1\3') from dual
returns
2,2.1,3,4
ANSWER below:
select col1,
       regexp_replace(
         listagg(col2, ',') within group (order by col2), -- sorted
         '([^,]+)(,\1)*(,|$)', '\1\3')
from tableX
where rn = 1 -- filter from the original answer; omit it if your table has no rn column
group by col1;
Note: The above works in most cases, provided the list is sorted. Depending on your data, you may also have to trim leading and trailing spaces.
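The trimming mentioned in the note could look something like the sketch below. It assumes the stray spaces sit only at the start or end of each value; padded_rows and col2_list are placeholder names, and the inline rows are made up for illustration.

```sql
-- Hypothetical data with leading/trailing spaces (padded_rows is a placeholder name).
with padded_rows (col1, col2) as (
  select 1, 'a ' from dual union all
  select 1, 'a'  from dual union all
  select 1, ' b' from dual
)
-- Trim before aggregating and sort on the trimmed value,
-- so that equal items end up adjacent for the regex to collapse.
select col1,
       regexp_replace(
         listagg(trim(col2), ',') within group (order by trim(col2)),
         '([^,]+)(,\1)*(,|$)', '\1\3') as col2_list
from padded_rows
group by col1;
```

Sorting on trim(col2) rather than col2 matters here: the regex only removes duplicates that are next to each other.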
If a group has a lot of items (more than 20 or so) or the strings are long, you might run into Oracle's string size limit: 'result of string concatenation is too long' (ORA-01489).
From Oracle 12cR2 you can suppress this error (see here). Alternatively, put a maximum on the number of members in each group. This only works if it is OK to list only the first members. If you have very long variable-length strings this may not work; you will have to experiment.
select col1,
       case
         when count(col2) < 100 then
           regexp_replace(
             listagg(col2, ',') within group (order by col2),
             '([^,]+)(,\1)*(,|$)', '\1\3')
         else 'Too many entries to list...'
       end
from sometable
where rn = 1
group by col1;
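For the 12cR2 route mentioned above, a rough sketch using the ON OVERFLOW TRUNCATE clause of LISTAGG might look like this. It assumes Oracle 12cR2 or later; many_rows and col2_list are placeholder names, and the generated rows only stand in for a group large enough to exceed the limit.

```sql
-- Generated stand-in data: values 0..1499, each appearing twice
-- (many_rows is a placeholder name, not a table from the post).
with many_rows as (
  select 1 as col1, mod(level, 1500) as col2
  from dual
  connect by level <= 3000
)
-- De-duplicate first, then let LISTAGG truncate instead of raising ORA-01489.
select col1,
       listagg(col2, ',' on overflow truncate '...' with count)
         within group (order by col2) as col2_list
from (select distinct col1, col2 from many_rows)
group by col1;
```

With WITH COUNT, the truncated string ends in the indicator followed by the number of omitted values, so the reader can at least see that the list was cut short.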
Another solution (not so simple) that hopefully avoids Oracle's string size limit, where the string size is limited to 4000. Thanks to this post here by user3465996.
select col1,
       dbms_xmlgen.convert( -- HTML decode
         dbms_lob.substr( -- limit size to 4000 chars
           ltrim( -- remove leading commas
             regexp_replace(
               replace(
                 replace(
                   xmlagg(xmlelement("A", col2) order by col2).getClobVal(),
                   '<A>', ','),
                 '</A>', ''),
               '([^,]+)(,\1)*(,|$)', '\1\3'),
             ','), -- remove leading XML commas ltrim
           4000, 1) -- limit to 4000 string size
         , 1) -- HTML.decode
       as col2
from sometable
where rn = 1
group by col1;
V1 - some test cases - FYI
regexp_replace('2,2,2.1,3,3,4,4','([^,]+)(,\1)+', '\1') -> 2.1,3,4 Fail
regexp_replace('2 ,2 ,2.1,3 ,3 ,4 ,4 ','([^,]+)(,\1)+', '\1') -> 2 ,2.1,3,4 Success - fixed length items
V2 - items contained within items, e.g. 2,21
regexp_replace('2.1,1','([^,]+)(,\1)+', '\1') -> 2.1 Fail
regexp_replace('2 ,2 ,2.1,1 ,3 ,4 ,4 ','(^|,)(.+)(,\2)+', '\1\2') -> 2 ,2.1,1 ,3 ,4 -- success - NEW regex
regexp_replace('a,b,b,b,b,c','(^|,)(.+)(,\2)+', '\1\2') -> a,b,b,c fail!
V3 - regex, thanks Igor! Works in all cases.
select
  regexp_replace('2,2,2.1,3,3,4,4','([^,]+)(,\1)*(,|$)', '\1\3'), --> 2,2.1,3,4 works
  regexp_replace('2.1,1','([^,]+)(,\1)*(,|$)', '\1\3'),           --> 2.1,1 works
  regexp_replace('a,b,b,b,b,c','([^,]+)(,\1)*(,|$)', '\1\3')      --> a,b,c works
from dual