Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

algorithm advice for finding maximum items within a time period

Tags:

sql

mysql

I have a database schema that is similar to the following:

| User   | Event         | Date
|--------|---------------|------
| 111    | Walked dog    | 2009-10-1
| 222    | Walked dog    | 2009-10-2
| 333    | Fed Fish      | 2009-10-5
| 222    | Did Laundry   | 2009-10-6
| 111    | Fed Fish      | 2009-10-7
| 111    | Walked dog    | 2009-10-18
| 222    | Walked dog    | 2009-10-19
| 111    | Fed Fish      | 2009-10-21

I would like to produce a query that returns the maximum number of times a user performs some action within a time period. For example, given a time period of 5 days, what is the maximum number of times user 111 walked the dog?

The most obvious solution would be to start at some zero point and move forward each day, summing up 5 day periods along the way, then taking the maximum total out of all the 5 day windows. the approach seems incredibly costly however.

I would appreciate any suggestions you may have.

EDIT 1:

Thanks for the comments / answers. To respond: - I'm using mySQL v5.0 - There could be any number of events per day (per any time period really) - @Paulo Santos: thanks,but like the comment points out, i need to find the window that produces the most results, the window itself can slide. - @Mark: this looks like an interesting solution, although i recall reading that mySQL does not support backing up or jumping ahead cursors.
- @orbMan: this looks promising. I don't fully understand it yet, but I will give this a try tonight. - @mjv: another promising solution. also looks complicated, but I will give it another look

thanks again!

like image 499
D.C. Avatar asked Mar 18 '10 19:03

D.C.


3 Answers

For you specific request I'd do something like:

SELECT User, Event, Count(*)
  FROM Table
 WHERE Date between @d1 and @d2
 Group by User, Event

Then it will return the number of time each user performed each task within the specified (@d1 and @d2) time frame.

like image 89
Paulo Santos Avatar answered Sep 25 '22 02:09

Paulo Santos


select top 1 x.Date as StartDate, DATEADD(day, 5, x.Date) as EndDate, COUNT(*) as Count
from Event e
inner join Event x on 1=1
where e.Date between x.Date and DATEADD(day, 5, x.Date)
    and e.Event = 'Walked dog'
group by x.Date, DATEADD(day, 5, x.Date)
order by Count desc

Output:

StartDate  EndDate    Count
---------- ---------- -----------
2009-10-01 2009-10-06 2
like image 31
D'Arcy Rittich Avatar answered Sep 24 '22 02:09

D'Arcy Rittich


Here's an alternative algorithm that is cursor based.

Start with two cursors, begin and end, both pointing at the initial row, and current count = 0, and current maximum = 0.

If DATE_DIFF(end.date, begin.date) is more than 5, advance the begin cursor one row. Subtract one from current count if the old row was 'walked the dog'.

If DATE_DIFF(end.date, begin.date) is not more than 5, advance the end cursor one row. Aadd one to current count if the new row is 'walked the dog'. If the current count is greater than current maximum, set current maximum to current count.

Continue until you have covered all rows in the range.

like image 37
Mark Byers Avatar answered Sep 23 '22 02:09

Mark Byers