
MySQL cumulative sum grouped by date

I know there have been a few posts related to this, but my case is a little bit different and I wanted to get some help on this.

I need to pull a cumulative count of interactions by day out of the database. Currently, this is what I have:

SELECT
   e.Date AS e_date,
   count(e.ID) AS num_interactions
FROM example AS e
JOIN example e1 ON e1.Date <= e.Date
GROUP BY e.Date;

The output of this is close to what I want but not exactly what I need.

The problem I'm having is that the dates are stored with the hour, minute, and second at which the interaction happened, so the GROUP BY does not group rows from the same day together.

This is what the output looks like.

[screenshot of the query output not included]

On 12-23 there are 5 interactions, but they are not grouped because the timestamps differ. So I need to find a way to ignore the time and just look at the day.

If I try GROUP BY DAY(e.Date), it groups the data by the day of the month only (i.e., everything that happened on the 1st of any month is grouped into one row), and the output is not what I want at all.

[screenshot of the query output not included]

GROUP BY DAY(e.Date), MONTH(e.Date) is splitting it up by month and the day of the month, but again the count is off.

[screenshot of the query output not included]

I'm not a MySQL expert at all, so I'm puzzled about what I'm missing.

asked Mar 09 '14 00:03 by John Ruddell


People also ask

How do you do cumulative sum in MySQL?

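One common way to create a cumulative sum column in MySQL is to create a user variable, set its value to 0, and add each row's value to it as the rows are read, so the sum grows step by step. MySQL 8.0 and later also support window functions such as SUM() OVER (ORDER BY ...) for this.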

What is meant by cumulative sum?

Cumulative sums, or running totals, are used to display the total sum of data as it grows with time (or any other series or progression). This lets you view the total contribution so far of a given measure against time.

What is running total?

A running total is the cumulative sum of a value and all previous values in the column. For example, imagine you are in sales and storing information about the number of items sold on a particular day. You might want to calculate a running total, the total number of items sold up to a specific date.


1 Answer

New Answer

At first, I didn't understand you were trying to do a running total. Here is how that would look:

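-- Start the running total at zero.
SET @runningTotal = 0;
SELECT
    e_date,
    num_interactions,
    -- Add each day's count to the total accumulated so far.
    @runningTotal := @runningTotal + totals.num_interactions AS runningTotal
FROM
-- Inner query: one row per calendar day, with the time stripped by DATE().
(SELECT
    DATE(e.Date) AS e_date,
    COUNT(*) AS num_interactions
FROM example AS e
GROUP BY DATE(e.Date)) totals
-- Note: MySQL documents the evaluation order of user variables within a
-- statement as undefined, so check the result on your version.
ORDER BY e_date;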
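
On MySQL 8.0 or later, the same running total can probably be written with a window function instead of a user variable. This is only a sketch: it assumes the question's table example with a datetime column Date, and the alias running_total is made up for illustration.

```sql
-- Sketch for MySQL 8.0+ (window functions are not available in 5.x).
SELECT
    e_date,
    num_interactions,
    -- Sum of this day's count and all earlier days' counts.
    SUM(num_interactions) OVER (ORDER BY e_date) AS running_total
FROM
    -- One row per calendar day, as in the query above.
    (SELECT
        DATE(e.Date) AS e_date,
        COUNT(*) AS num_interactions
     FROM example AS e
     GROUP BY DATE(e.Date)) AS totals
ORDER BY e_date;
```

Because each e_date appears only once in the derived table, the default window frame ends at the current day, which gives a cumulative count.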

Original Answer

You could be getting duplicates because of your join. Maybe e1 has more than one match for some rows which is inflating your count. Either that or the comparison in your join is also comparing the seconds, which is not what you expect.
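
To make the inflation concrete, here is a small sketch with made-up data. The table name example_demo and its five timestamps are hypothetical, not taken from the question. Once rows are compared by day, a self-join on <= produces one joined row for every pair (row of that day, row on or before that day), so counting joined rows gives a product rather than a running total.

```sql
-- Hypothetical demo table: 2 interactions on 12-22, 3 on 12-23.
CREATE TABLE example_demo (
    ID INT AUTO_INCREMENT PRIMARY KEY,
    `Date` DATETIME NOT NULL
);

INSERT INTO example_demo (`Date`) VALUES
    ('2013-12-22 09:00:00'),
    ('2013-12-22 17:30:00'),
    ('2013-12-23 08:15:00'),
    ('2013-12-23 12:00:00'),
    ('2013-12-23 23:59:59');

-- Counting rows of the self-join counts pairs of rows:
--   2013-12-22 -> 2 rows that day * 2 rows on or before it = 4
--   2013-12-23 -> 3 rows that day * 5 rows on or before it = 15
-- The cumulative counts actually wanted are 2 and 5.
SELECT
    DATE(e.`Date`) AS e_date,
    COUNT(e.ID) AS joined_rows
FROM example_demo AS e
JOIN example_demo AS e1 ON DATE(e1.`Date`) <= DATE(e.`Date`)
GROUP BY DATE(e.`Date`);

-- Clean up the demo table.
DROP TABLE example_demo;
```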

Anyhow, instead of chopping the datetime field into days and months, just strip the time from it. Here is how you do that.

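-- Join each distinct day to every row on or before it, so each
-- interaction is counted once per day rather than once per pair of rows.
SELECT
   d.e_date,
   COUNT(e1.ID) AS num_interactions
FROM (SELECT DISTINCT DATE(e.Date) AS e_date FROM example AS e) AS d
JOIN example AS e1 ON DATE(e1.Date) <= d.e_date
GROUP BY d.e_date;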
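
As a quick illustration of why DATE() works for grouping where DAY() does not, the sketch below compares the two on a pair of hypothetical timestamps one month apart. The literal values are made up for the example.

```sql
-- DAY() keeps only the day of the month, so different months collide;
-- DATE() keeps year, month, and day and drops only the time.
SELECT
    DAY('2013-12-23 14:05:09')  AS dec_day,   -- 23
    DAY('2014-01-23 09:30:00')  AS jan_day,   -- 23, same bucket as dec_day
    DATE('2013-12-23 14:05:09') AS dec_date,  -- 2013-12-23
    DATE('2014-01-23 09:30:00') AS jan_date;  -- 2014-01-23
```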

answered Sep 26 '22 00:09 by clhereistian