
SqlAlchemy: count of distinct over multiple columns

I can't do:

>>> session.query(func.count(distinct(Hit.ip_address, Hit.user_agent))).first()
TypeError: distinct() takes exactly 1 argument (2 given)

I can do:

session.query(func.count(distinct(func.concat(Hit.ip_address, Hit.user_agent)))).first()

Which is fine for my case (a count of unique users in a 'pageload' db table).

But concatenation isn't correct in the general case. For example, it will give a count of 1 instead of 2 for the following table, because both rows concatenate to 'xxyy':

 col_a | col_b
-------+-------
 xx    | yy
 xxy   | y

Is there any way to generate the following SQL (which is valid in PostgreSQL at least)?

SELECT count(distinct (col_a, col_b)) FROM my_table; 
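The concatenation collision described above can be reproduced with a small sketch. It uses Python's built-in sqlite3 module rather than PostgreSQL, which is an assumption made only so the example runs without a server. SQLite does not accept count(distinct (col_a, col_b)), so the correct count is obtained here with a SELECT DISTINCT subquery instead. The variable names concat_count and pair_count are illustrative.

```python
import sqlite3

# In-memory database holding the two-row table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (col_a TEXT, col_b TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [("xx", "yy"), ("xxy", "y")],
)

# Concatenation: both rows become 'xxyy', so they collapse into one value.
concat_count = conn.execute(
    "SELECT count(DISTINCT col_a || col_b) FROM my_table"
).fetchone()[0]

# Distinct pairs: the rows are compared column by column, so both survive.
pair_count = conn.execute(
    "SELECT count(*) FROM (SELECT DISTINCT col_a, col_b FROM my_table)"
).fetchone()[0]

print(concat_count, pair_count)  # prints "1 2"
```

Inserting a separator between the concatenated columns reduces the risk of such collisions, but it only works if the separator can never appear in the data, which is why comparing the columns as a pair is the safer approach.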
EoghanM asked May 24 '13 06:05



2 Answers

distinct() accepts more than one argument when appended to the query object:

session.query(Hit).distinct(Hit.ip_address, Hit.user_agent).count() 

It should generate something like:

SELECT count(*) AS count_1
FROM (SELECT DISTINCT ON (hit.ip_address, hit.user_agent)
             hit.ip_address AS hit_ip_address,
             hit.user_agent AS hit_user_agent
      FROM hit) AS anon_1

which is even a bit closer to what you wanted. Note that DISTINCT ON is PostgreSQL-specific.

RedNaxel answered Sep 22 '22 23:09


The exact query can be produced using the tuple_() construct:

from sqlalchemy import distinct, func, tuple_

session.query(func.count(distinct(tuple_(Hit.ip_address, Hit.user_agent)))).scalar()
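To check what the tuple_() approach renders without connecting to a database, the expression can be compiled against the PostgreSQL dialect. This is a minimal sketch assuming SQLAlchemy 1.4 or newer. The Hit model below is a hypothetical stand-in for the one in the question (its id column is invented for illustration), and it uses select() instead of session.query() so that no session is needed.

```python
from sqlalchemy import Column, Integer, String, distinct, func, select, tuple_
from sqlalchemy.dialects import postgresql
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Hit(Base):
    """Hypothetical model standing in for the question's 'pageload' table."""

    __tablename__ = "hit"

    id = Column(Integer, primary_key=True)  # assumed primary key
    ip_address = Column(String)
    user_agent = Column(String)


# Same expression as in the answer, built as a standalone SELECT.
stmt = select(func.count(distinct(tuple_(Hit.ip_address, Hit.user_agent))))

# Render the statement for PostgreSQL; nothing is executed.
sql = str(stmt.compile(dialect=postgresql.dialect()))
print(sql)
```

The printed statement should contain a count over DISTINCT (hit.ip_address, hit.user_agent), which matches the SQL asked for in the question. Row-value comparison of this kind is not supported by every backend, so the query may fail on databases other than PostgreSQL.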
Ilja Everilä answered Sep 21 '22 23:09