Count the number of rows in 30 day bins

Each row in my table has a datetime stamp. Starting from now, I want to count how many rows fall in the last 30 days, the 30 days before that, and so on, until the 30-day bins reach back to the start of the table.

I have done this successfully from Python by issuing several queries, but I'm almost certain it can be done in a single MySQL query.

asked Dec 30 '12 10:12

seanieb

3 Answers

No stored procedures, no temporary tables, just one query, and an efficient execution plan given an index on the date column:

select

  subdate(
    '2012-12-31',
    floor(dateDiff('2012-12-31', dateStampColumn) / 30) * 30 + 30 - 1
  ) as "period starting",

  subdate(
    '2012-12-31',
    floor(dateDiff('2012-12-31', dateStampColumn) / 30) * 30
  ) as "period ending",

  count(*)

from
  YOURTABLE
group by floor(dateDiff('2012-12-31', dateStampColumn) / 30);

It should be pretty obvious what is happening here, except for this incantation:

floor(dateDiff('2012-12-31', dateStampColumn) / 30)

That expression appears several times, and it evaluates to how many whole 30-day periods ago dateStampColumn falls (0 for the most recent period). dateDiff returns the difference in days; dividing by 30 converts that to 30-day periods, and floor() rounds it down to an integer. Once we have this number, we can GROUP BY it, and a bit of arithmetic translates it back into the starting and ending dates of the period.
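Since the question mentions Python, here is a minimal sketch of the same arithmetic outside the database, which may help when checking the query by hand. The function name `bin_for` and its parameters are made up for illustration; only the formula comes from the query above.

```python
from datetime import date, timedelta

def bin_for(stamp: date, reference: date, width: int = 30):
    """Return (bin index, period start, period end) for one date."""
    days_ago = (reference - stamp).days   # same value dateDiff(reference, stamp) gives
    index = days_ago // width             # floor(days_ago / 30)
    # Same arithmetic as the two subdate() calls in the query.
    start = reference - timedelta(days=index * width + width - 1)
    end = reference - timedelta(days=index * width)
    return index, start, end

if __name__ == "__main__":
    print(bin_for(date(2012, 11, 1), date(2012, 12, 31)))
```

For 2012-11-01 against a reference of 2012-12-31, the difference is 60 days, so the row lands in bin 2, which runs from 2012-10-03 to 2012-11-01.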

Replace '2012-12-31' with now() if you prefer. Here's some sample data:

CREATE TABLE YOURTABLE
    (`Id` int, `dateStampColumn` datetime);

INSERT INTO YOURTABLE
    (`Id`, `dateStampColumn`)
VALUES
    (1, '2012-10-15 02:00:00'),
    (1, '2012-10-17 02:00:00'),
    (1, '2012-10-30 02:00:00'),
    (1, '2012-10-31 02:00:00'),
    (1, '2012-11-01 02:00:00'),
    (1, '2012-11-02 02:00:00'),
    (1, '2012-11-18 02:00:00'),
    (1, '2012-11-19 02:00:00'),
    (1, '2012-11-21 02:00:00'),
    (1, '2012-11-25 02:00:00'),
    (1, '2012-11-25 02:00:00'),
    (1, '2012-11-26 02:00:00'),
    (1, '2012-11-26 02:00:00'),
    (1, '2012-11-24 02:00:00'),
    (1, '2012-11-23 02:00:00'),
    (1, '2012-11-28 02:00:00'),
    (1, '2012-11-29 02:00:00'),
    (1, '2012-11-30 02:00:00'),
    (1, '2012-12-01 02:00:00'),
    (1, '2012-12-02 02:00:00'),
    (1, '2012-12-15 02:00:00'),
    (1, '2012-12-17 02:00:00'),
    (1, '2012-12-18 02:00:00'),
    (1, '2012-12-19 02:00:00'),
    (1, '2012-12-21 02:00:00'),
    (1, '2012-12-25 02:00:00'),
    (1, '2012-12-25 02:00:00'),
    (1, '2012-12-26 02:00:00'),
    (1, '2012-12-26 02:00:00'),
    (1, '2012-12-24 02:00:00'),
    (1, '2012-12-23 02:00:00'),
    (1, '2012-12-31 02:00:00'),
    (1, '2012-12-30 02:00:00'),
    (1, '2012-12-28 02:00:00'),
    (1, '2012-12-28 02:00:00'),
    (1, '2012-12-30 02:00:00');

And the result:

period starting     period ending   count(*)
2012-12-02          2012-12-31      17
2012-11-02          2012-12-01      14
2012-10-03          2012-11-01      5

Period endpoints are inclusive.
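As a rough cross-check of those counts, the sketch below redoes the binning in plain Python on the same sample dates. The time of day is dropped, because dateDiff only compares the date parts. The names `SAMPLE` and `count_bins` are illustrative, not part of the answer.

```python
from collections import Counter
from datetime import date

# Day-of-month values copied from the sample INSERT (all rows are in 2012).
SAMPLE = {
    10: [15, 17, 30, 31],
    11: [1, 2, 18, 19, 21, 25, 25, 26, 26, 24, 23, 28, 29, 30],
    12: [1, 2, 15, 17, 18, 19, 21, 25, 25, 26, 26, 24, 23, 31, 30, 28, 28, 30],
}
REFERENCE = date(2012, 12, 31)

def count_bins(dates, reference, width=30):
    """Count dates per bin; bin 0 is the period ending on the reference date."""
    return Counter((reference - d).days // width for d in dates)

dates = [date(2012, month, day) for month, days in SAMPLE.items() for day in days]
counts = count_bins(dates, REFERENCE)

if __name__ == "__main__":
    for index in sorted(counts):
        print(index, counts[index])
```

Bins 0, 1 and 2 come out as 17, 14 and 5 rows, matching the query result.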

Play with this in SQL Fiddle.

There's a bit of potential goofiness in that any 30 day period with zero matching rows will not be included in the result. If you could join this against a table of periods, that could be eliminated. However, MySQL doesn't have anything like PostgreSQL's generate_series(), so you'd have to deal with it in your application or try this clever hack.
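If you go the "deal with it in your application" route, something like this sketch could fill in the gaps. It assumes the query is changed to also return the bin number, so the result can be loaded into a dict mapping bin number to row count. `fill_empty_bins` and the sample `rows` are hypothetical.

```python
def fill_empty_bins(counts):
    """Return (bin, count) pairs for every bin from 0 to the oldest one seen.

    Bins that the query left out (because they had no rows) get a count of 0.
    """
    if not counts:
        return []
    oldest = max(counts)
    return [(index, counts.get(index, 0)) for index in range(oldest + 1)]

if __name__ == "__main__":
    # Hypothetical query result in which bin 1 had no rows.
    rows = {0: 17, 2: 5}
    print(fill_empty_bins(rows))
```

The period start and end dates for each bin can then be computed from the bin number with the same arithmetic the query uses.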

answered Oct 02 '22 05:10

Phil Frost

If you only need the intervals that contain at least one row, you could use this:

select
  datediff(curdate(), `date`) div 30 as block,
  count(*) as rows_per_block
from
  your_table
group by
  block

And this also shows the start date and the end date:

select
  datediff(curdate(), `date`) div 30 as block,
  -- oldest day in the block
  date_sub(curdate(),
           INTERVAL (1+datediff(curdate(), `date`) div 30)*30-1 DAY) as start_block,
  -- most recent day in the block
  date_sub(curdate(),
           INTERVAL (datediff(curdate(), `date`) div 30)*30 DAY) as end_block,
  count(*)
from your_table
group by block

But if you also need to show all intervals, including those with no rows, you could use a solution like this:

select
  num,
  date_sub(curdate(),
           INTERVAL (num+1)*30-1 DAY) as start_block,
  date_sub(curdate(),
           INTERVAL num*30 DAY) as end_block,
  count(`date`)
from
  numbers left join your_table
  on `date` between date_sub(curdate(),
           INTERVAL (num+1)*30-1 DAY)  and
  date_sub(curdate(),
           INTERVAL num*30 DAY)
where num<=(datediff(curdate(), (select min(`date`) from your_table) ) div 30)
group by num

This requires a numbers table to be prepared already (a table numbers with an integer column num holding 0, 1, 2, and so on), or see fiddle here for a solution without a numbers table.

answered Oct 02 '22 05:10

fthiella

Try this:

SELECT 
  DATE_FORMAT(t1.`Date`, '%Y-%m-%d'),
  COUNT(t2.Id)
FROM 
(
  SELECT SUBDATE(CURDATE(), ID) `Date`
  FROM
  (
    SELECT  t2.digit * 10 + t1.digit + 1 AS id
    FROM         TEMP AS t1
    CROSS JOIN TEMP AS t2
  ) t 
  WHERE Id <= 30 
) t1
LEFT JOIN YOURTABLE t2 ON DATE(t1.`Date`) = DATE(t2.dateStampColumn)
GROUP BY t1.`Date`;

SQL Fiddle Demo

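But you will first need to create a helper table TEMP holding the digits 0 to 9, like so (the table name is written the same way in both statements, since table names can be case-sensitive in MySQL):

CREATE TABLE TEMP 
(Digit int);
INSERT INTO TEMP VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9);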
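To see what the digits table buys you, here is a small Python sketch that mimics the cross join and the SUBDATE call. As far as I can tell from the query, it generates one row per day for the previous 30 days, so the result is a per-day count rather than a 30-day bin. The function names are made up for illustration.

```python
from datetime import date, timedelta
from itertools import product

DIGITS = range(10)  # the ten rows of the TEMP table: 0..9

def generated_ids(limit=30):
    """Mimic t2.digit * 10 + t1.digit + 1 over TEMP CROSS JOIN TEMP, keeping id <= limit."""
    ids = sorted(t2 * 10 + t1 + 1 for t1, t2 in product(DIGITS, DIGITS))
    return [i for i in ids if i <= limit]

def generated_dates(today, limit=30):
    """Mimic SUBDATE(CURDATE(), id) for each generated id."""
    return [today - timedelta(days=i) for i in generated_ids(limit)]

if __name__ == "__main__":
    for day in generated_dates(date(2012, 12, 31)):
        print(day)
```

The cross join yields the ids 1 to 100, the WHERE clause keeps 1 to 30, and each id becomes a date that many days before today.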

answered Oct 02 '22 06:10

Mahmoud Gamal

Tags: select, mysql, count, datediff