I am trying to build a query that tells me how many distinct women and men there are in a given dataset. The person is identified by a number 'tel'. It is possible for the same 'tel' to appear multiple times, but that 'tel's gender should only be counted one time!
7136609221 - male
7136609222 - male
7136609223 - female
7136609228 - male
7136609222 - male
7136609223 - female
This example_dataset would yield the following.
Total unique gender count: 4
Total unique male count: 3
Total unique female count: 1
My attempted query:
SELECT COUNT(DISTINCT tel, gender) as gender_count, COUNT(DISTINCT tel, gender = 'male') as man_count, SUM(if(gender = 'female', 1, 0)) as woman_count FROM example_dataset;
There's actually two attempts in there. COUNT(DISTINCT tel, gender = 'male') as man_count seems to just return the same as COUNT(DISTINCT tel, gender); it doesn't take into account the qualifier there. And the SUM(if(gender = 'female', 1, 0)) counts all the female records, but is not filtered by DISTINCT tels.
To count distinct values, you can use DISTINCT inside the COUNT() aggregate function.
To count the number of different values that are stored in a given column, you simply need to designate the column you pass in to the COUNT function as DISTINCT. When given a column, COUNT returns the number of non-NULL values in that column. Combining this with DISTINCT returns only the number of unique (and non-NULL) values.
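To make the difference between COUNT(*), COUNT(column) and COUNT(DISTINCT column) concrete, here is a small sketch that runs against the example data. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and it adds one hypothetical row with a NULL tel (not part of the original dataset) purely to show that NULLs are skipped.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example_dataset (tel TEXT, gender TEXT)")
conn.executemany(
    "INSERT INTO example_dataset (tel, gender) VALUES (?, ?)",
    [
        ("7136609221", "male"),
        ("7136609222", "male"),
        ("7136609223", "female"),
        ("7136609228", "male"),
        ("7136609222", "male"),
        ("7136609223", "female"),
        (None, "male"),  # hypothetical row, added only to show NULL handling
    ],
)

# COUNT(*) counts rows, COUNT(tel) counts non-NULL tels,
# COUNT(DISTINCT tel) counts unique non-NULL tels.
total_rows, non_null_tels, distinct_tels = conn.execute(
    "SELECT COUNT(*), COUNT(tel), COUNT(DISTINCT tel) FROM example_dataset"
).fetchone()

print(total_rows, non_null_tels, distinct_tels)  # 7 6 4
```

The fact that NULLs are ignored is what makes a CASE expression without an ELSE branch usable as a filter inside COUNT(DISTINCT ...).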
In Excel, you can combine the SUM and COUNTIF functions to count the values that appear exactly once in a range. The syntax for this combined formula is =SUM(IF(1/COUNTIF(data, data)=1,1,0)). Here the COUNTIF formula counts the number of times each value in the range appears.
The SQL COUNT function returns the number of rows that match a specified condition. Its syntax is COUNT([ALL | DISTINCT] expression). By default, the SQL Server COUNT function uses the ALL keyword.
Here's one option using a subquery with DISTINCT:
SELECT COUNT(*) gender_count,
       SUM(IF(gender='male',1,0)) male_count,
       SUM(IF(gender='female',1,0)) female_count
FROM (
    SELECT DISTINCT tel, gender
    FROM example_dataset
) t
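If you want to try the subquery approach without a MySQL server, something like the following should work. It is only a sketch: it assumes Python's built-in sqlite3 module, and since IF() is MySQL-specific, the query uses the equivalent CASE expression instead.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example_dataset (tel TEXT, gender TEXT)")
conn.executemany(
    "INSERT INTO example_dataset (tel, gender) VALUES (?, ?)",
    [
        ("7136609221", "male"),
        ("7136609222", "male"),
        ("7136609223", "female"),
        ("7136609228", "male"),
        ("7136609222", "male"),
        ("7136609223", "female"),
    ],
)

# The inner query collapses duplicate (tel, gender) pairs first;
# the outer query then counts the de-duplicated rows.
# CASE ... END replaces MySQL's IF() for portability.
subquery_sql = """
SELECT COUNT(*) AS gender_count,
       SUM(CASE WHEN gender = 'male' THEN 1 ELSE 0 END) AS male_count,
       SUM(CASE WHEN gender = 'female' THEN 1 ELSE 0 END) AS female_count
FROM (
    SELECT DISTINCT tel, gender
    FROM example_dataset
) t
"""

gender_count, male_count, female_count = conn.execute(subquery_sql).fetchone()
print(gender_count, male_count, female_count)  # 4 3 1
```

This matches the expected totals from the question: 4 people overall, 3 male and 1 female.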
This will also work if you don't want to use a subquery:
SELECT COUNT(DISTINCT tel) gender_count,
       COUNT(DISTINCT CASE WHEN gender = 'male' THEN tel END) male_count,
       COUNT(DISTINCT CASE WHEN gender = 'female' THEN tel END) female_count
FROM example_dataset
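One thing worth checking is how the two queries behave if the same tel ever shows up with two different genders. The sketch below assumes Python's built-in sqlite3 module in place of MySQL, uses CASE instead of IF() in the subquery version, and adds one hypothetical conflicting row (7136609221 as female) that is not in the original data.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example_dataset (tel TEXT, gender TEXT)")
conn.executemany(
    "INSERT INTO example_dataset (tel, gender) VALUES (?, ?)",
    [
        ("7136609221", "male"),
        ("7136609222", "male"),
        ("7136609223", "female"),
        ("7136609228", "male"),
        ("7136609222", "male"),
        ("7136609223", "female"),
        ("7136609221", "female"),  # hypothetical conflicting row
    ],
)

# Subquery version: counts distinct (tel, gender) pairs.
subquery_result = conn.execute("""
SELECT COUNT(*),
       SUM(CASE WHEN gender = 'male' THEN 1 ELSE 0 END),
       SUM(CASE WHEN gender = 'female' THEN 1 ELSE 0 END)
FROM (SELECT DISTINCT tel, gender FROM example_dataset) t
""").fetchone()

# CASE version: counts distinct tels; the CASE yields NULL for
# non-matching rows, and COUNT(DISTINCT ...) ignores NULLs.
case_result = conn.execute("""
SELECT COUNT(DISTINCT tel),
       COUNT(DISTINCT CASE WHEN gender = 'male' THEN tel END),
       COUNT(DISTINCT CASE WHEN gender = 'female' THEN tel END)
FROM example_dataset
""").fetchone()

print(subquery_result)  # (5, 3, 2)
print(case_result)      # (4, 3, 2)
```

With clean data both queries agree. With a conflicting tel, the subquery version counts that person twice in the total, while the CASE version still reports 4 distinct tels, so which one you want depends on how you would like such rows treated.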