
Count distinct over multiple columns in Redshift

I am trying to count the rows that have a distinct combination of two columns in Amazon Redshift. The query I am using is:

select count(distinct col1, col2)
from schemaname.tablename
where some filters

It throws this error:

Amazon Invalid operation: function count(character varying, bigint) does not exist

I tried casting the bigint to char, but it didn't work.

Janusz01 asked Sep 24 '18 05:09


3 Answers

You can select the distinct combinations in a subquery and count its rows:

select count(*) from (
  select distinct col1, col2
  from schemaname.tablename
  where some filter
) as t
Zaynul Abadin Tuhin answered Sep 20 '22 14:09


A little late to the party, but you can also concatenate the columns with the || operator. It may be inefficient, so I wouldn't use it in production code, but it should be fine for ad-hoc analysis.

select count(distinct col1 || '_' || col2)
from schemaname.tablename
where some filters

Note that the choice of separator might matter: both 'foo' || '_' || 'bar_baz' and 'foo_bar' || '_' || 'baz' yield 'foo_bar_baz', so two different combinations are counted as one. In some cases this might be a concern; in others it is so insignificant that you can skip the separator completely.
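To make the separator collision concrete, here is a minimal sketch that runs both the concatenation query and the subquery form on the same rows. It uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for Redshift (an assumption made only so the example is runnable), and the table t and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for Redshift here; the table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (col1 text, col2 text)")
conn.executemany(
    "insert into t values (?, ?)",
    [("foo", "bar_baz"), ("foo_bar", "baz"), ("foo", "bar_baz")],
)

# Concatenating with '_' turns both different pairs into 'foo_bar_baz'.
concat_count = conn.execute(
    "select count(distinct col1 || '_' || col2) from t"
).fetchone()[0]

# The subquery form compares the columns separately, so the pairs stay distinct.
subquery_count = conn.execute(
    "select count(*) from (select distinct col1, col2 from t) as s"
).fetchone()[0]

print(concat_count, subquery_count)  # prints: 1 2
```

Here the table holds two distinct pairs, yet the concatenation query reports one. A separator that cannot occur in either column avoids the collision, if such a character exists for your data.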

Mariusz Sakowski answered Sep 22 '22 14:09


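You can use GROUP BY, which returns one row per distinct (col1, col2) combination together with the number of rows in it: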

select col1, col2, count(*)
from schemaname.tablename
where -- your filter
group by col1, col2
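The grouped query produces one row per combination, not the single number the question asks for, so its rows still have to be counted. A minimal sketch of both ways to do that, again assuming an in-memory SQLite database (via Python's sqlite3) as a stand-in for Redshift and a hypothetical table t:

```python
import sqlite3

# In-memory SQLite stands in for Redshift here; the table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (col1 text, col2 integer)")
conn.executemany(
    "insert into t values (?, ?)",
    [("a", 1), ("a", 1), ("a", 2), ("b", 1)],
)

# One row per (col1, col2) combination, each with its row count.
groups = conn.execute(
    "select col1, col2, count(*) from t group by col1, col2 order by col1, col2"
).fetchall()

# Option 1: count the grouped rows on the client side.
distinct_pairs = len(groups)

# Option 2: wrap the grouped query in a subquery and count in SQL.
wrapped_count = conn.execute(
    "select count(*) from (select col1, col2 from t group by col1, col2) as g"
).fetchone()[0]

print(groups)
print(distinct_pairs, wrapped_count)
```

The second option keeps the work in the database and returns a single row, which matters when the number of combinations is large.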
Deepak answered Sep 20 '22 14:09