Baselines - table:
date / time of call, duration of call
How to calculate "Busy hour"?
http://en.wikipedia.org/wiki/Busy_hour - In a communications system, the sliding 60-minute period during which occurs the maximum total traffic load in a given 24-hour period.
Busy hour means the continuous one-hour period of the day during which volume of traffic is greater than during any other continuous one-hour period of the same day.
The busy hour is important because it determines how much bandwidth is needed for the various components of the network. For example, the data connection between the ISP core and a neighborhood must be large enough to handle the busy hour – otherwise, bandwidth and speeds get slower.
For an extended period (as many days as you can be bothered to compute for) calculate the number of calls active at each minute of the day -- there are 1440 of them. Then for each minute of the day calculate the moving 60-minute sum of active calls, the minute for which this sum is maximised is the start of the busy hour (provided that you've calculated forward moving averages). Don't forget to wrap around at midnight.
This seems so simple that I suspect I've misunderstood your question.
This can be done in linear time, by using two pointers (Left and Right) traversing the sorted list of events. Actually, the traversing is done on a sorted list of call events, containing both call start events and call end events.
The left pointer starts from the leftmost start event. The right pointer moves to the latest event not later than (left ptr time + 1hr). We also maintain a sum variable that calculates the total of all call durations in the interval.
At each iteration we move the left pointer to the next event (updating the sum accordingly), and then we progress with the right pointer (updating the sum, again) until its end position (little less than 1 hr later).
The maximum value of sum is obtained at the busy hour window.
I didn't give details on how to update the sum variable when moving the pointers, but I believe it should not be complicated to do it in constant time.
The advantage of this algorithm is that it does not have the granularity problem, and that it is linear in the number of events.
--EDIT--
Actually this is an O(N*Log N) algorithm, since the input table does not provide the sorting of events that I assumed. We need to generate the sorted lists of events first
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With