Each row in my table has a datetime stamp. I want to query the database, starting from now, and count how many rows fall in the last 30 days, the 30 days before that, and so on, until the 30-day bins reach back to the start of the table.
I have successfully carried out this query by using Python and making several queries. But I'm almost certain that it can be done in one single MySQL query.
The COUNT() function returns the number of rows that match a specified criterion.
To count rows, you can count a column that stores unique, non-NULL values, for example COUNT(id). Use the GROUP BY clause to group records by one or more columns, and use the HAVING clause to filter the resulting groups on an aggregate such as COUNT.
To count all of the rows in a table, whether they contain NULL values or not, use COUNT(*). That form of COUNT() returns the number of rows in the result set of a SELECT statement.
No stored procedures and no temporary tables: just one query, with an efficient execution plan given an index on the date column:
select
    subdate(
        '2012-12-31',
        floor(dateDiff('2012-12-31', dateStampColumn) / 30) * 30 + 30 - 1
    ) as "period starting",
    subdate(
        '2012-12-31',
        floor(dateDiff('2012-12-31', dateStampColumn) / 30) * 30
    ) as "period ending",
    count(*)
from
    YOURTABLE
group by floor(dateDiff('2012-12-31', dateStampColumn) / 30);
It should be pretty obvious what is happening here, except for this incantation:
floor(dateDiff('2012-12-31', dateStampColumn) / 30)
That expression appears several times, and it evaluates to the number of 30-day periods ago dateStampColumn is. dateDiff returns the difference in days; divide it by 30 to get it in 30-day periods, and feed it all to floor() to round it down to an integer. Once we have this number, we can GROUP BY it, and then we do a bit of math to translate this number back into the starting and ending dates of the period.
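The same arithmetic can be sketched outside the database. The snippet below is only an illustration in Python (the function name period_for and the constant PERIOD_DAYS are made up for this example, not part of the query), mirroring dateDiff, floor and subdate with plain date subtraction:

```python
from datetime import date, timedelta

PERIOD_DAYS = 30  # width of each bin, as in the query


def period_for(reference: date, stamp: date) -> tuple:
    """Return (bin index, period start, period end) for one date."""
    days_ago = (reference - stamp).days      # dateDiff(reference, stamp)
    index = days_ago // PERIOD_DAYS          # floor(days_ago / 30)
    # subdate(reference, index * 30) is the most recent day of the bin
    end = reference - timedelta(days=index * PERIOD_DAYS)
    # subdate(reference, index * 30 + 30 - 1) is the oldest day of the bin
    start = reference - timedelta(days=index * PERIOD_DAYS + PERIOD_DAYS - 1)
    return index, start, end


print(period_for(date(2012, 12, 31), date(2012, 11, 18)))
# (1, datetime.date(2012, 11, 2), datetime.date(2012, 12, 1))
```

A date 43 days before the reference lands in bin 1, whose inclusive endpoints are 30 and 59 days back.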
Replace '2012-12-31' with now() if you prefer. Here's some sample data:
CREATE TABLE YOURTABLE
(`Id` int, `dateStampColumn` datetime);
INSERT INTO YOURTABLE
(`Id`, `dateStampColumn`)
VALUES
(1, '2012-10-15 02:00:00'),
(1, '2012-10-17 02:00:00'),
(1, '2012-10-30 02:00:00'),
(1, '2012-10-31 02:00:00'),
(1, '2012-11-01 02:00:00'),
(1, '2012-11-02 02:00:00'),
(1, '2012-11-18 02:00:00'),
(1, '2012-11-19 02:00:00'),
(1, '2012-11-21 02:00:00'),
(1, '2012-11-25 02:00:00'),
(1, '2012-11-25 02:00:00'),
(1, '2012-11-26 02:00:00'),
(1, '2012-11-26 02:00:00'),
(1, '2012-11-24 02:00:00'),
(1, '2012-11-23 02:00:00'),
(1, '2012-11-28 02:00:00'),
(1, '2012-11-29 02:00:00'),
(1, '2012-11-30 02:00:00'),
(1, '2012-12-01 02:00:00'),
(1, '2012-12-02 02:00:00'),
(1, '2012-12-15 02:00:00'),
(1, '2012-12-17 02:00:00'),
(1, '2012-12-18 02:00:00'),
(1, '2012-12-19 02:00:00'),
(1, '2012-12-21 02:00:00'),
(1, '2012-12-25 02:00:00'),
(1, '2012-12-25 02:00:00'),
(1, '2012-12-26 02:00:00'),
(1, '2012-12-26 02:00:00'),
(1, '2012-12-24 02:00:00'),
(1, '2012-12-23 02:00:00'),
(1, '2012-12-31 02:00:00'),
(1, '2012-12-30 02:00:00'),
(1, '2012-12-28 02:00:00'),
(1, '2012-12-28 02:00:00'),
(1, '2012-12-30 02:00:00');
And the result:
period starting    period ending    count(*)
2012-12-02         2012-12-31       17
2012-11-02         2012-12-01       14
2012-10-03         2012-11-01       5
Period endpoints are inclusive.
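As a sanity check on those three rows, here is a small Python sketch that bins the same sample dates with the same arithmetic. It is not how MySQL executes the query; the names SAMPLE and count_by_period are invented for the example, and the time of day is dropped on the assumption that only calendar dates matter, which is how dateDiff behaves.

```python
from collections import Counter
from datetime import date, timedelta

REFERENCE = date(2012, 12, 31)
PERIOD_DAYS = 30

# (month, day) pairs for the 36 sample rows from 2012, in insert order.
SAMPLE = [
    (10, 15), (10, 17), (10, 30), (10, 31), (11, 1), (11, 2),
    (11, 18), (11, 19), (11, 21), (11, 25), (11, 25), (11, 26),
    (11, 26), (11, 24), (11, 23), (11, 28), (11, 29), (11, 30),
    (12, 1), (12, 2), (12, 15), (12, 17), (12, 18), (12, 19),
    (12, 21), (12, 25), (12, 25), (12, 26), (12, 26), (12, 24),
    (12, 23), (12, 31), (12, 30), (12, 28), (12, 28), (12, 30),
]


def count_by_period(rows, reference=REFERENCE, period_days=PERIOD_DAYS):
    """Group dates into 30-day bins counted back from the reference date."""
    counts = Counter((reference - d).days // period_days for d in rows)
    result = []
    for index in sorted(counts):  # bin 0 is the most recent period
        end = reference - timedelta(days=index * period_days)
        start = end - timedelta(days=period_days - 1)
        result.append((start, end, counts[index]))
    return result


rows = [date(2012, month, day) for month, day in SAMPLE]
for start, end, n in count_by_period(rows):
    print(start, end, n)
# 2012-12-02 2012-12-31 17
# 2012-11-02 2012-12-01 14
# 2012-10-03 2012-11-01 5
```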
Play with this in SQL Fiddle.
There's a bit of potential goofiness in that any 30 day period with zero matching rows will not be included in the result. If you could join this against a table of periods, that could be eliminated. However, MySQL doesn't have anything like PostgreSQL's generate_series(), so you'd have to deal with it in your application or try this clever hack.
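If the gaps are handled in the application, the fix is small. The following Python sketch assumes the row dates (or the per-bin counts) have already been fetched; the function name counts_with_empty_periods is hypothetical. It fills every bin between the newest and the oldest with a zero where no rows matched:

```python
from collections import Counter
from datetime import date


def counts_with_empty_periods(rows, reference, period_days=30):
    """Return one count per bin, index 0 = most recent, zeros included."""
    if not rows:
        return []
    counts = Counter((reference - d).days // period_days for d in rows)
    oldest_bin = max(counts)
    # Walk every bin index, not just the ones that have rows.
    # Dates after the reference (negative index) are ignored here.
    return [counts.get(i, 0) for i in range(oldest_bin + 1)]


rows = [date(2012, 12, 30), date(2012, 12, 28), date(2012, 10, 15)]
print(counts_with_empty_periods(rows, date(2012, 12, 31)))
# [2, 0, 1]
```

The middle period has no rows, so a plain GROUP BY would omit it; here it shows up as 0.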
If you just need to count intervals where there's at least one row, you could use this:
select
    datediff(curdate(), `date`) div 30 as block,
    count(*) as rows_per_block
from
    your_table
group by
    block
And this also shows the start date and the end date:
select
    datediff(curdate(), `date`) div 30 as block,
    date_sub(curdate(),
        INTERVAL (1+datediff(curdate(), `date`) div 30)*30-1 DAY) as start_block,
    date_sub(curdate(),
        INTERVAL (datediff(curdate(), `date`) div 30)*30 DAY) as end_block,
    count(*)
from your_table
group by block
But if you also need to show all intervals, you could use a solution like this:
select
    num,
    date_sub(curdate(),
        INTERVAL (num+1)*30-1 DAY) as start_block,
    date_sub(curdate(),
        INTERVAL num*30 DAY) as end_block,
    count(`date`)
from
    numbers left join your_table
    on `date` between date_sub(curdate(),
                          INTERVAL (num+1)*30-1 DAY) and
                      date_sub(curdate(),
                          INTERVAL num*30 DAY)
where num<=(datediff(curdate(), (select min(`date`) from your_table) ) div 30)
group by num
This requires that you have a numbers table already prepared, or see fiddle here for a solution without numbers table.
Try this:
SELECT
    DATE_FORMAT(t1.`Date`, '%Y-%m-%d'),
    COUNT(t2.Id)
FROM
(
    SELECT SUBDATE(CURDATE(), id) `Date`
    FROM
    (
        SELECT t2.digit * 10 + t1.digit + 1 AS id
        FROM TEMP AS t1
        CROSS JOIN TEMP AS t2
    ) t
    WHERE id <= 30
) t1
LEFT JOIN YOURTABLE t2 ON DATE(t1.`Date`) = DATE(t2.dateStampColumn)
GROUP BY t1.`Date`;
But you will need to create a helper table TEMP like so:
CREATE TABLE TEMP
    (Digit int);
INSERT INTO TEMP VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
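To see what the digit table is doing, here is a rough Python equivalent of the inner subquery. It is only an illustration: DIGITS stands in for the ten rows of TEMP, and the variable names are invented. Cross joining the ten digits with themselves yields the integers 1 through 100, and the WHERE clause keeps the first 30 as day offsets.

```python
from itertools import product

DIGITS = range(10)  # stands in for the ten rows of the TEMP table

# CROSS JOIN of TEMP with itself: every (t2.digit, t1.digit) pair,
# combined as t2.digit * 10 + t1.digit + 1.
ids = sorted(t2 * 10 + t1 + 1 for t2, t1 in product(DIGITS, DIGITS))

# WHERE id <= 30 keeps the offsets that SUBDATE(CURDATE(), id) walks back.
day_offsets = [i for i in ids if i <= 30]

print(len(ids), ids[0], ids[-1])           # 100 1 100
print(day_offsets == list(range(1, 31)))   # True
```

Note that the offsets start at 1, so this query produces one row per day for the 30 days before today, rather than one row per 30-day bin.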