
LISTAGG in Oracle to return distinct values

I am trying to use the LISTAGG function in Oracle. I would like to get only the distinct values for that column. Is there a way in which I can get only the distinct values without creating a function or a procedure?

    col1  col2  Created_by
    1     2     Smith
    1     2     John
    1     3     Ajay
    1     4     Ram
    1     5     Jack

I need to select col1 and the LISTAGG of col2 (column 3 is not considered). When I do that, I get something like this as the result of LISTAGG: [2,2,3,4,5]

I need to remove the duplicate '2' here; I need only the distinct values of col2 against col1.

Priyanth asked Jul 16 '12 19:07



2 Answers

19c and later:

    select listagg(distinct the_column, ',') within group (order by the_column)
    from the_table

18c and earlier:

    select listagg(the_column, ',') within group (order by the_column)
    from (
      select distinct the_column
      from the_table
    ) t

If you need more columns, something like this might be what you are looking for:

    select col1, listagg(col2, ',') within group (order by col2)
    from (
      select col1,
             col2,
             row_number() over (partition by col1, col2 order by col1) as rn
      from foo
      order by col1, col2
    )
    where rn = 1
    group by col1;
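On 19c or later, the same per-col1 list can be sketched with DISTINCT directly. This is only an illustration: the rows are copied from the question, the name foo follows the query above, and the alias col2_list is made up for the example.

```sql
-- Sample rows from the question, inlined as a CTE so the query runs on its own.
with foo (col1, col2, created_by) as (
  select 1, 2, 'Smith' from dual union all
  select 1, 2, 'John'  from dual union all
  select 1, 3, 'Ajay'  from dual union all
  select 1, 4, 'Ram'   from dual union all
  select 1, 5, 'Jack'  from dual
)
-- DISTINCT inside LISTAGG requires Oracle 19c or later.
select col1,
       listagg(distinct col2, ',') within group (order by col2) as col2_list
from foo
group by col1;
```

For this data the query should return a single row with col1 = 1 and col2_list = 2,3,4,5.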
a_horse_with_no_name answered Sep 22 '22 09:09


From Oracle 19c this is built in: LISTAGG accepts the DISTINCT keyword.

For 18c and earlier, try LISTAGG ... WITHIN GROUP over de-duplicated rows.

Otherwise, use regular expressions.

Here's how to solve your issue.

    select regexp_replace(
             '2,2,2.1,3,3,3,3,4,4',
             '([^,]+)(,\1)*(,|$)', '\1\3')
    from dual

returns

2,2.1,3,4

ANSWER below:

    select col1,
           regexp_replace(
             listagg(col2, ',') within group (order by col2),  -- sorted
             '([^,]+)(,\1)*(,|$)', '\1\3')
    from tableX
    where rn = 1  -- assumes tableX exposes an rn column; drop this filter otherwise
    group by col1;

Note: The above will work in most cases, provided the list is sorted. Depending on your data, you may also have to trim leading and trailing spaces from the items.
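To see why both conditions matter, here is a small sketch using the same regex on literal strings. The column aliases are made up for the example; the point is that the pattern only collapses duplicates that are adjacent and character-for-character identical.

```sql
select
  -- duplicates are adjacent: collapsed, should give 2,3
  regexp_replace('2,2,3',  '([^,]+)(,\1)*(,|$)', '\1\3') as sorted_list,
  -- duplicates are not adjacent: left alone, should give 2,3,2
  regexp_replace('2,3,2',  '([^,]+)(,\1)*(,|$)', '\1\3') as unsorted_list,
  -- '2 ' and '2' differ by a trailing space: left alone, should give 2 ,2,3
  regexp_replace('2 ,2,3', '([^,]+)(,\1)*(,|$)', '\1\3') as untrimmed_list
from dual;
```

For string columns, one option is to wrap the item in TRIM both inside LISTAGG and in its ORDER BY, so that the items are normalized before they are sorted and concatenated.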

If a group has a lot of items (more than 20 or so) or the strings are long, you might run into Oracle's string size limit: 'result of string concatenation is too long' (ORA-01489).

From Oracle 12cR2 you can suppress this error with the ON OVERFLOW TRUNCATE clause of LISTAGG. Alternatively, put a maximum on the number of members in each group. This only works if it is OK not to list every member. If you have very long variable-length strings this may not work; you will have to experiment.

    select col1,
           case
             when count(col2) < 100 then
               regexp_replace(
                 listagg(col2, ',') within group (order by col2),
                 '([^,]+)(,\1)*(,|$)', '\1\3')
             else
               'Too many entries to list...'
           end
    from sometable
    where rn = 1
    group by col1;
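The 12cR2 option mentioned above is the ON OVERFLOW TRUNCATE clause of LISTAGG. A minimal sketch follows, with sample rows from the question inlined under the name sometable used above; the alias col2_list and the '...' truncation indicator are illustrative choices, not something from the original answer.

```sql
-- Sample rows from the question, inlined as a CTE so the query runs on its own.
with sometable (col1, col2) as (
  select 1, 2 from dual union all
  select 1, 2 from dual union all
  select 1, 3 from dual union all
  select 1, 4 from dual union all
  select 1, 5 from dual
)
-- ON OVERFLOW TRUNCATE requires Oracle 12cR2 or later.
select col1,
       listagg(col2, ',' on overflow truncate '...' with count)
         within group (order by col2) as col2_list
from sometable
group by col1;
```

With so few rows nothing is truncated. For a group whose concatenation would exceed the limit, the list is cut off and ends with the indicator and a count of the omitted values instead of raising the error. Note that truncation applies to the raw concatenation, so if a regexp_replace de-duplication is wrapped around this call, duplicates still count toward the limit.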

Another solution (not so simple) that hopefully avoids the Oracle string size limit; here the result is cut to 4000 characters. Thanks to a post by user3465996.

    select col1,
           dbms_xmlgen.convert(  -- HTML decode
             dbms_lob.substr(  -- limit size to 4000 chars
               ltrim(  -- remove leading commas
                 regexp_replace(
                   replace(
                     replace(
                       xmlagg(xmlelement("A", col2) order by col2).getClobVal(),
                       '<A>', ','),
                     '</A>', ''),
                   '([^,]+)(,\1)*(,|$)', '\1\3'),
                 ','),  -- remove leading XML commas (ltrim)
               4000, 1)  -- limit to 4000 string size
             , 1)  -- HTML decode
           as col2
    from sometable
    where rn = 1
    group by col1;

V1 - some test cases - FYI

    regexp_replace('2,2,2.1,3,3,4,4','([^,]+)(,\1)+', '\1') -> 2.1,3,4 Fail
    regexp_replace('2 ,2 ,2.1,3 ,3 ,4 ,4 ','([^,]+)(,\1)+', '\1') -> 2 ,2.1,3,4 Success - fixed length items

V2 - items contained within other items, e.g. 2,21

    regexp_replace('2.1,1','([^,]+)(,\1)+', '\1') -> 2.1 Fail
    regexp_replace('2 ,2 ,2.1,1 ,3 ,4 ,4 ','(^|,)(.+)(,\2)+', '\1\2') -> 2 ,2.1,1 ,3 ,4 -- success - NEW regex
    regexp_replace('a,b,b,b,b,c','(^|,)(.+)(,\2)+', '\1\2') -> a,b,b,c fail!

V3 - regex (thanks, Igor!). Works in all cases.

    select
      regexp_replace('2,2,2.1,3,3,4,4','([^,]+)(,\1)*(,|$)', '\1\3'), ---> 2,2.1,3,4 works
      regexp_replace('2.1,1','([^,]+)(,\1)*(,|$)', '\1\3'), --> 2.1,1 works
      regexp_replace('a,b,b,b,b,c','([^,]+)(,\1)*(,|$)', '\1\3') ---> a,b,c works
    from dual
ozmike answered Sep 22 '22 09:09