I have a database that stores phone call records. Each phone call record has a start time and an end time. I want to find out what is the maximum amount of phone calls that are simultaneously happening in order to know if we have exceed the amount of available phone lines in our phone bank. How could I go about solving this problem?
Disclaimer: I'm writing my answer based on the (excelent) following post:
https://www.itprotoday.com/sql-server/calculating-concurrent-sessions-part-3 (Part1 and 2 are recomended also)
The first thing to understand here with that problem is that most of the current solutions found in the internet can have basically two issues
The common performance problem in solutions like the proposed by Unreasons is a cuadratic solution, for each call you need to check all the other calls if they are overlaped.
there is an algoritmical linear common solution that is list all the "events" (start call and end call) ordered by date, and add 1 for a start and substract 1 for a hang-up, and remember the max. That can be implemented easily with a cursor (solution proposed by Hafhor seems to be in that way) but cursors are not the most efficient ways to solve problems.
The referenced article has excelent examples, differnt solutions, performance comparison of them. The proposed solution is:
WITH C1 AS
(
SELECT starttime AS ts, +1 AS TYPE,
ROW_NUMBER() OVER(ORDER BY starttime) AS start_ordinal
FROM Calls
UNION ALL
SELECT endtime, -1, NULL
FROM Calls
),
C2 AS
(
SELECT *,
ROW_NUMBER() OVER( ORDER BY ts, TYPE) AS start_or_end_ordinal
FROM C1
)
SELECT MAX(2 * start_ordinal - start_or_end_ordinal) AS mx
FROM C2
WHERE TYPE = 1
Explanation
suppose this set of data
+-------------------------+-------------------------+
| starttime | endtime |
+-------------------------+-------------------------+
| 2009-01-01 00:02:10.000 | 2009-01-01 00:05:24.000 |
| 2009-01-01 00:02:19.000 | 2009-01-01 00:02:35.000 |
| 2009-01-01 00:02:57.000 | 2009-01-01 00:04:04.000 |
| 2009-01-01 00:04:12.000 | 2009-01-01 00:04:52.000 |
+-------------------------+-------------------------+
This is a way to implement with a query the same idea, adding 1 for each starting of a call and substracting 1 for each ending.
SELECT starttime AS ts, +1 AS TYPE,
ROW_NUMBER() OVER(ORDER BY starttime) AS start_ordinal
FROM Calls
this part of the C1 CTE will take each starttime of each call and number it
+-------------------------+------+---------------+
| ts | TYPE | start_ordinal |
+-------------------------+------+---------------+
| 2009-01-01 00:02:10.000 | 1 | 1 |
| 2009-01-01 00:02:19.000 | 1 | 2 |
| 2009-01-01 00:02:57.000 | 1 | 3 |
| 2009-01-01 00:04:12.000 | 1 | 4 |
+-------------------------+------+---------------+
Now this code
SELECT endtime, -1, NULL
FROM Calls
Will generate all the "endtimes" without row numbering
+-------------------------+----+------+
| endtime | | |
+-------------------------+----+------+
| 2009-01-01 00:02:35.000 | -1 | NULL |
| 2009-01-01 00:04:04.000 | -1 | NULL |
| 2009-01-01 00:04:52.000 | -1 | NULL |
| 2009-01-01 00:05:24.000 | -1 | NULL |
+-------------------------+----+------+
Now making the UNION to have the full C1 CTE definition, you will have both tables mixed
+-------------------------+------+---------------+
| ts | TYPE | start_ordinal |
+-------------------------+------+---------------+
| 2009-01-01 00:02:10.000 | 1 | 1 |
| 2009-01-01 00:02:19.000 | 1 | 2 |
| 2009-01-01 00:02:57.000 | 1 | 3 |
| 2009-01-01 00:04:12.000 | 1 | 4 |
| 2009-01-01 00:02:35.000 | -1 | NULL |
| 2009-01-01 00:04:04.000 | -1 | NULL |
| 2009-01-01 00:04:52.000 | -1 | NULL |
| 2009-01-01 00:05:24.000 | -1 | NULL |
+-------------------------+------+---------------+
C2 is computed sorting and numbering C1 with a new column
C2 AS
(
SELECT *,
ROW_NUMBER() OVER( ORDER BY ts, TYPE) AS start_or_end_ordinal
FROM C1
)
+-------------------------+------+-------+--------------+
| ts | TYPE | start | start_or_end |
+-------------------------+------+-------+--------------+
| 2009-01-01 00:02:10.000 | 1 | 1 | 1 |
| 2009-01-01 00:02:19.000 | 1 | 2 | 2 |
| 2009-01-01 00:02:35.000 | -1 | NULL | 3 |
| 2009-01-01 00:02:57.000 | 1 | 3 | 4 |
| 2009-01-01 00:04:04.000 | -1 | NULL | 5 |
| 2009-01-01 00:04:12.000 | 1 | 4 | 6 |
| 2009-01-01 00:04:52.000 | -1 | NULL | 7 |
| 2009-01-01 00:05:24.000 | -1 | NULL | 8 |
+-------------------------+------+-------+--------------+
And there is where the magic occurs, at any time the result of #start - #ends is the amount of cocurrent calls at this moment.
for each Type = 1 (start event) we have the #start value in the 3rd column. and we also have the #start + #end (in the 4th column)
#start_or_end = #start + #end
#end = (#start_or_end - #start)
#start - #end = #start - (#start_or_end - #start)
#start - #end = 2 * #start - #start_or_end
so in SQL:
SELECT MAX(2 * start_ordinal - start_or_end_ordinal) AS mx
FROM C2
WHERE TYPE = 1
In this case with the prposed set of calls, the result is 2.
In the proposed article, there is a little improvment to have a grouped result by for example a service or a "phone company" or "phone central" and this idea can also be used to group for example by time slot and have the maximum concurrency hour by hour in a given day.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With