MySQL GROUP BY age range including null ranges

Tags: mysql

I'm trying to count the number of people in each age range, and I can almost do it, but I have two problems:

  1. If there are no people in a given age range, that range does not appear in the results at all. For example, my data has no entries for "Over 80", so that age range is missing. The gaps make the output look like a programming mistake.

  2. I'd like to order the results in a specific way. In the query below, the ORDER BY is on age_range, which sorts the labels as strings, so the results for '20 - 29' come before the results for 'Under 20'.

Here's a sample of the db table "inquiries":

inquiry_id  birth_date
1           1960-02-01
2           1962-03-04
3           1970-03-08
4           1980-03-02
5           1990-02-08
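
To try the queries on this page, the sample rows can be loaded as shown here. This is a minimal sketch: the column types are assumptions (the question does not give the table definition), and birth_date is assumed to be nullable because the query has a "Not Filled In (NULL)" bucket.

```sql
-- Assumed schema; only the column names come from the question.
CREATE TABLE inquiries (
    inquiry_id INT PRIMARY KEY,
    birth_date DATE NULL          -- NULL when the date was not filled in
);

-- The five sample rows shown above.
INSERT INTO inquiries (inquiry_id, birth_date) VALUES
    (1, '1960-02-01'),
    (2, '1962-03-04'),
    (3, '1970-03-08'),
    (4, '1980-03-02'),
    (5, '1990-02-08');
```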

Here's the query:

SELECT
    CASE
        WHEN age < 20 THEN 'Under 20'
        WHEN age BETWEEN 20 and 29 THEN '20 - 29'
        WHEN age BETWEEN 30 and 39 THEN '30 - 39'
        WHEN age BETWEEN 40 and 49 THEN '40 - 49'
        WHEN age BETWEEN 50 and 59 THEN '50 - 59'
        WHEN age BETWEEN 60 and 69 THEN '60 - 69'
        WHEN age BETWEEN 70 and 79 THEN '70 - 79'
        WHEN age >= 80 THEN 'Over 80'
        WHEN age IS NULL THEN 'Not Filled In (NULL)'
    END as age_range,
    COUNT(*) AS count

    FROM (SELECT TIMESTAMPDIFF(YEAR, birth_date, CURDATE()) AS age FROM inquiries) as derived

    GROUP BY age_range

    ORDER BY age_range

Here's a simple solution based on the suggestion by Wrikken:

SELECT
    SUM(IF(age < 20,1,0)) as 'Under 20',
    SUM(IF(age BETWEEN 20 and 29,1,0)) as '20 - 29',
    SUM(IF(age BETWEEN 30 and 39,1,0)) as '30 - 39',
    SUM(IF(age BETWEEN 40 and 49,1,0)) as '40 - 49',
    SUM(IF(age BETWEEN 50 and 59,1,0)) as '50 - 59',
    SUM(IF(age BETWEEN 60 and 69,1,0)) as '60 - 69',
    SUM(IF(age BETWEEN 70 and 79,1,0)) as '70 - 79',
    SUM(IF(age >=80, 1, 0)) as 'Over 80',
    SUM(IF(age IS NULL, 1, 0)) as 'Not Filled In (NULL)'

FROM (SELECT TIMESTAMPDIFF(YEAR, birth_date, CURDATE()) AS age FROM inquiries) as derived
Mitchell asked Jul 14 '10 15:07



2 Answers

As an alternative to a range table (which I would prefer), a single-row answer could be:

SELECT
    SUM(IF(age < 20,1,0)) as 'Under 20',
    SUM(IF(age BETWEEN 20 and 29,1,0)) as '20 - 29',
    SUM(IF(age BETWEEN 30 and 39,1,0)) as '30 - 39',
    SUM(IF(age BETWEEN 40 and 49,1,0)) as '40 - 49',
...etc.
FROM (SELECT TIMESTAMPDIFF(YEAR, birth_date, CURDATE()) AS age FROM inquiries) AS derived;
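
The range table mentioned above is not spelled out in this answer. One way it might look is sketched below, under stated assumptions: the table name age_ranges and its columns (ordinal, label, min_age, max_age) are hypothetical, and the bounds mirror the CASE expression in the question. Birth dates are assumed not to lie in the future, so 0 is used as the lowest bound.

```sql
-- Hypothetical lookup table: one row per bucket, with an explicit sort key.
CREATE TABLE age_ranges (
    ordinal INT PRIMARY KEY,       -- display order
    label   VARCHAR(30) NOT NULL,  -- text shown in the result
    min_age INT NULL,              -- inclusive lower bound; NULL marks the "not filled in" bucket
    max_age INT NULL               -- inclusive upper bound; NULL means no upper bound
);

INSERT INTO age_ranges (ordinal, label, min_age, max_age) VALUES
    (1, 'Under 20',             0,    19),
    (2, '20 - 29',              20,   29),
    (3, '30 - 39',              30,   39),
    (4, '40 - 49',              40,   49),
    (5, '50 - 59',              50,   59),
    (6, '60 - 69',              60,   69),
    (7, '70 - 79',              70,   79),
    (8, 'Over 80',              80,   NULL),
    (9, 'Not Filled In (NULL)', NULL, NULL);

-- Start from the ranges and LEFT JOIN the ages, so a range with no
-- matching inquiries still produces a row (with a count of 0).
SELECT
    r.label             AS age_range,
    COUNT(d.inquiry_id) AS count    -- counts only rows that actually matched
FROM age_ranges AS r
LEFT JOIN (
    SELECT inquiry_id,
           TIMESTAMPDIFF(YEAR, birth_date, CURDATE()) AS age
    FROM inquiries
) AS d
    ON (d.age >= r.min_age AND (r.max_age IS NULL OR d.age <= r.max_age))
    OR (d.age IS NULL AND r.min_age IS NULL)   -- missing birth dates go to bucket 9
GROUP BY r.ordinal, r.label
ORDER BY r.ordinal;
```

Because the join starts from age_ranges, every bucket yields a row even when nobody falls into it, and ORDER BY r.ordinal gives the display order without relying on how the labels sort as strings.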
Wrikken answered Oct 12 '22 10:10


One way to order the results is to add a column to the SELECT list that assigns each range a rank in the order you want, and then order by that column. For example:

SELECT
CASE
    WHEN age < 20 THEN 'Under 20'
    WHEN age BETWEEN 20 and 29 THEN '20 - 29'
    WHEN age BETWEEN 30 and 39 THEN '30 - 39'
    WHEN age BETWEEN 40 and 49 THEN '40 - 49'
    WHEN age BETWEEN 50 and 59 THEN '50 - 59'
    WHEN age BETWEEN 60 and 69 THEN '60 - 69'
    WHEN age BETWEEN 70 and 79 THEN '70 - 79'
    WHEN age >= 80 THEN 'Over 80'
    WHEN age IS NULL THEN 'Not Filled In (NULL)'
END as age_range,
COUNT(*) AS count,
 CASE
    WHEN age < 20 THEN 1
    WHEN age BETWEEN 20 and 29 THEN 2
    WHEN age BETWEEN 30 and 39 THEN 3
    WHEN age BETWEEN 40 and 49 THEN 4
    WHEN age BETWEEN 50 and 59 THEN 5
    WHEN age BETWEEN 60 and 69 THEN 6
    WHEN age BETWEEN 70 and 79 THEN 7
    WHEN age >= 80 THEN 8
    WHEN age IS NULL THEN 9
END as ordinal

FROM (SELECT TIMESTAMPDIFF(YEAR, birth_date, CURDATE()) AS age FROM inquiries) as derived

-- ordinal is grouped as well so the query is accepted when ONLY_FULL_GROUP_BY is enabled
GROUP BY age_range, ordinal

ORDER BY ordinal
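
A variant that avoids repeating the CASE expression, offered as a sketch and not as part of the answer above: because the buckets are contiguous, the smallest age in each group already sorts the groups in the intended order. MySQL sorts NULL first in ascending order, so the extra `MIN(age) IS NULL` key is there to push the "Not Filled In (NULL)" group to the end.

```sql
SELECT
    CASE
        WHEN age < 20 THEN 'Under 20'
        WHEN age BETWEEN 20 AND 29 THEN '20 - 29'
        WHEN age BETWEEN 30 AND 39 THEN '30 - 39'
        WHEN age BETWEEN 40 AND 49 THEN '40 - 49'
        WHEN age BETWEEN 50 AND 59 THEN '50 - 59'
        WHEN age BETWEEN 60 AND 69 THEN '60 - 69'
        WHEN age BETWEEN 70 AND 79 THEN '70 - 79'
        WHEN age >= 80 THEN 'Over 80'
        WHEN age IS NULL THEN 'Not Filled In (NULL)'
    END AS age_range,
    COUNT(*) AS count
FROM (SELECT TIMESTAMPDIFF(YEAR, birth_date, CURDATE()) AS age FROM inquiries) AS derived
GROUP BY age_range
-- First key: 0 for groups with a known age, 1 for the NULL group (sorted last).
-- Second key: the lowest age in the group, which follows the bucket order.
ORDER BY MIN(age) IS NULL, MIN(age);
```

This only fixes the ordering; like the query above, it still omits ranges that contain no rows.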
Mike Aono answered Oct 12 '22 11:10