Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Create range bins from SQL Server table for histograms

Tags:

I have the following table in SQL Server:

----------------------------- ID       Age          Gender 1          30           F 2          35           M 3          32           M 4          18           F 5          21           F 

What I need to do is to execute a query that will group the records in given ranges and count the occurences. The results need to be displayed later in a histogram chart (bar chart). I tried a query similar to the following:

SELECT  count(CASE WHEN Age>= 10 AND Age < 20 THEN 1 END) AS '10 - 20', count(CASE WHEN Age>= 21 AND Age < 30 THEN 1 END) AS '21 - 30', count(CASE WHEN Age>= 31 AND Age < 35 THEN 1 END) AS '31 - 35', count(CASE WHEN Age>= 36 AND Age < 40 THEN 1 END) AS '36 - 40', FROM (SELECT Age FROM Attendees) AS AgeGroups 

For the moment, this does the trick, but does not consider the gender column. It will result a single row which counts the frequency of every age group:

10-20     21-30     31-35     36-40   0         22        21        13 

If gender would be considered there should be two records shown, for each gender. I need to see something like:

Gender    10-20     21-30     31-35     36-40   M         0         12        9         6   F         0         10        12        7 

How should I approach this problem?

like image 484
user10901 Avatar asked Sep 30 '13 21:09

user10901


People also ask

Can we create histogram in SQL?

A histogram is a special type of column statistic that sorts values into buckets – as you might sort coins into buckets. Generating a histogram is a great way to understand the distribution of data. We'll look at multiple ways of generating histograms.

What is a bin or bucket SQL?

A histogram is an approximate representation of the distribution of numerical data. In other words, histograms show the number of data points that fall within a specified range of values (typically called “bins” or “buckets”).

What is data binning in SQL?

Binning, also known as discretization, is a way of putting similar data values into bins. We will accomplish this binning in T-SQL by using a CASE expression. In this tip, I have a table that contains 101,706 records. This table has an age column with age values from 0 to 100.


1 Answers

Simply add the Gender column to your SELECT and then do a GROUP BY.

SELECT  Gender, count(CASE WHEN Age>= 10 AND Age < 20 THEN 1 END) AS [10 - 20], count(CASE WHEN Age>= 21 AND Age < 30 THEN 1 END) AS [21 - 30], count(CASE WHEN Age>= 31 AND Age < 35 THEN 1 END) AS [31 - 35], count(CASE WHEN Age>= 36 AND Age < 40 THEN 1 END) AS [36 - 40] FROM Attendees AS AgeGroups GROUP BY Gender 
like image 72
Lamak Avatar answered Nov 12 '22 00:11

Lamak