I have a database schema that is similar to the following:
| User | Event | Date
|--------|---------------|------
| 111 | Walked dog | 2009-10-1
| 222 | Walked dog | 2009-10-2
| 333 | Fed Fish | 2009-10-5
| 222 | Did Laundry | 2009-10-6
| 111 | Fed Fish | 2009-10-7
| 111 | Walked dog | 2009-10-18
| 222 | Walked dog | 2009-10-19
| 111 | Fed Fish | 2009-10-21
I would like to produce a query that returns the maximum number of times a user performs some action within a time period. For example, given a time period of 5 days, what is the maximum number of times user 111 walked the dog?
The most obvious solution would be to start at some zero point and move forward each day, summing up 5 day periods along the way, then taking the maximum total out of all the 5 day windows. the approach seems incredibly costly however.
I would appreciate any suggestions you may have.
EDIT 1:
Thanks for the comments / answers. To respond:
- I'm using mySQL v5.0
- There could be any number of events per day (per any time period really)
- @Paulo Santos: thanks,but like the comment points out, i need to find the window that produces the most results, the window itself can slide.
- @Mark: this looks like an interesting solution, although i recall reading that mySQL does not support backing up or jumping ahead cursors.
- @orbMan: this looks promising. I don't fully understand it yet, but I will give this a try tonight.
- @mjv: another promising solution. also looks complicated, but I will give it another look
thanks again!
For you specific request I'd do something like:
SELECT User, Event, Count(*)
FROM Table
WHERE Date between @d1 and @d2
Group by User, Event
Then it will return the number of time each user performed each task within the specified (@d1
and @d2
) time frame.
select top 1 x.Date as StartDate, DATEADD(day, 5, x.Date) as EndDate, COUNT(*) as Count
from Event e
inner join Event x on 1=1
where e.Date between x.Date and DATEADD(day, 5, x.Date)
and e.Event = 'Walked dog'
group by x.Date, DATEADD(day, 5, x.Date)
order by Count desc
Output:
StartDate EndDate Count
---------- ---------- -----------
2009-10-01 2009-10-06 2
Here's an alternative algorithm that is cursor based.
Start with two cursors, begin and end, both pointing at the initial row, and current count = 0, and current maximum = 0.
If DATE_DIFF(end.date, begin.date) is more than 5, advance the begin cursor one row. Subtract one from current count if the old row was 'walked the dog'.
If DATE_DIFF(end.date, begin.date) is not more than 5, advance the end cursor one row. Aadd one to current count if the new row is 'walked the dog'. If the current count is greater than current maximum, set current maximum to current count.
Continue until you have covered all rows in the range.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With