Calculate number of concurrent events in SQL

Tags:

I have a table that holds phone calls, with the following fields:

ID
STARTTIME
ENDTIME
STATUS
CALL_FROM
CALL_TO

There are 2,9 million records loaded into a local PostgreSQL database. I added indexes on ID (unique index), starttime and endtime.

Searching on stackoverflow, I found some useful SQL and modified it to what I think logically should work. The problem is that the query runs for many hours and never returns:

Click to copy

SELECT T1.sid, count(*) as CountSimultaneous
FROM calls_nov T1, calls_nov T2
WHERE
     T1.StartTime between T2.StartTime and T2.EndTime
     and T1.StartTime between '2011-11-02' and '2011-11-03'
GROUP BY
     T1.sid
ORDER BY CountSimultaneous DESC;

Can someone please either suggest a way to fix the query/index so that it actually works or suggest another way to calculate concurrent calls?

EDIT:

Explain plan:

Click to copy

Sort  (cost=11796758237.81..11796758679.47 rows=176663 width=35)
  Sort Key: (count(*))
  ->  GroupAggregate  (cost=0.00..11796738007.56 rows=176663 width=35)
        ->  Nested Loop  (cost=0.00..11511290152.45 rows=57089217697 width=35)

Table creation script:

Click to copy

CREATE TABLE calls_nov (
  sid varchar,
  starttime timestamp, 
  endtime timestamp, 
  call_to varchar, 
  call_from varchar, 
  status varchar);

Index creation:

Click to copy

CREATE UNIQUE INDEX sid_unique_index on calls_nov (sid);

CREATE INDEX starttime_index on calls_nov (starttime);

CREATE INDEX endtime_index on calls_nov (endtime);

485

asked Jan 04 '12 20:01

Sologoub

1 Answers

Here's what the possible overlaps look like, where 'A' is the "reference" interval. Note that the query below (far, far below) doesn't give the same result as any of the answers yet posted.

Click to copy

-- A            |------|
-- B |-|
-- C        |---|
-- D          |---|
-- E             |---|
-- F               |---|
-- G                 |---|
-- H                   |---|
-- I                       |---|

"B" doesn't overlap "A" at all. "C" abuts it. {"D", "E", "F", "G"} overlaps it. "H" abuts it. "I" doesn't overlap it at all.

Click to copy

create table calls_nov (
  sid varchar(5) primary key,
  starttime timestamp not null,
  endtime timestamp not null
);  

insert into calls_nov values
('A', '2012-01-04 08:00:00', '2012-01-04 08:00:10'),
('B', '2012-01-04 07:50:00', '2012-01-04 07:50:03'),
('C', '2012-01-04 07:59:57', '2012-01-04 08:00:00'),
('D', '2012-01-04 07:59:57', '2012-01-04 08:00:03'),
('E', '2012-01-04 08:00:01', '2012-01-04 08:00:04'),
('F', '2012-01-04 08:00:07', '2012-01-04 08:00:10'),
('G', '2012-01-04 08:00:07', '2012-01-04 08:00:13'),
('H', '2012-01-04 08:00:10', '2012-01-04 08:00:13'),
('I', '2012-01-04 08:00:15', '2012-01-04 08:00:18');

You can see all the overlapping intervals like this. (I just used to_char() to make it easy to see all the data. You can omit it in production.)

Click to copy

select t1.sid, to_char(t1.starttime, 'HH12:MI:SS'), 
               to_char(t1.endtime,   'HH12:MI:SS'), 
       t2.sid, to_char(t2.starttime, 'HH12:MI:SS'), 
               to_char(t2.endtime,   'HH12:MI:SS')
from calls_nov t1
inner join calls_nov t2 on (t2.starttime, t2.endtime) 
                  overlaps (t1.starttime, t1.endtime) 
order by t1.sid, t2.sid;

A   08:00:00   08:00:10   A   08:00:00   08:00:10
A   08:00:00   08:00:10   D   07:59:57   08:00:03
A   08:00:00   08:00:10   E   08:00:01   08:00:04
A   08:00:00   08:00:10   F   08:00:07   08:00:10
A   08:00:00   08:00:10   G   08:00:07   08:00:13
B   07:50:00   07:50:03   B   07:50:00   07:50:03
C   07:59:57   08:00:00   C   07:59:57   08:00:00
C   07:59:57   08:00:00   D   07:59:57   08:00:03
D   07:59:57   08:00:03   A   08:00:00   08:00:10
D   07:59:57   08:00:03   C   07:59:57   08:00:00
D   07:59:57   08:00:03   D   07:59:57   08:00:03
D   07:59:57   08:00:03   E   08:00:01   08:00:04
E   08:00:01   08:00:04   A   08:00:00   08:00:10
E   08:00:01   08:00:04   D   07:59:57   08:00:03
E   08:00:01   08:00:04   E   08:00:01   08:00:04
F   08:00:07   08:00:10   A   08:00:00   08:00:10
F   08:00:07   08:00:10   F   08:00:07   08:00:10
F   08:00:07   08:00:10   G   08:00:07   08:00:13
G   08:00:07   08:00:13   A   08:00:00   08:00:10
G   08:00:07   08:00:13   F   08:00:07   08:00:10
G   08:00:07   08:00:13   G   08:00:07   08:00:13
G   08:00:07   08:00:13   H   08:00:10   08:00:13
H   08:00:10   08:00:13   G   08:00:07   08:00:13
H   08:00:10   08:00:13   H   08:00:10   08:00:13
I   08:00:15   08:00:18   I   08:00:15   08:00:18

You can see from this table that "A" should count 5, including itself. "B" should count 1; it overlaps itself, but no other intervals overlap it. That seems the right thing to do.

Counting is straightforward, but runs like a ruptured turtle. That's because evaluating an overlap takes a lot of work.

Click to copy

select t1.sid, count(t2.sid) as num_concurrent
from calls_nov t1
inner join calls_nov t2 on (t2.starttime, t2.endtime) 
                  overlaps (t1.starttime, t1.endtime) 
group by t1.sid
order by num_concurrent desc;

A   5
D   4
G   4
E   3
F   3
H   2
C   2
I   1
B   1

To get better performance, you can use the "table" above in a common table expression, and count based on that.

Click to copy

with interval_table as (
select t1.sid as sid_1, t1.starttime, t1.endtime,
       t2.sid as sid_2, t2.starttime, t2.endtime
from calls_nov t1
inner join calls_nov t2 on (t2.starttime, t2.endtime) 
                  overlaps (t1.starttime, t1.endtime) 
order by t1.sid, t2.sid
) 
select sid_1, count(sid_2) as num_concurrent
from interval_table
group by sid_1
order by num_concurrent desc;

122

answered Oct 19 '22 16:10

Mike Sherrill 'Cat Recall'

Related questions
                            
                                Select from Table1, Table2
                            
                                How do I find elements in an array in BigQuery
                            
                                Why does a SQL join choose a sub-optimal query plan?
                            
                                Flattening intersecting timespans
                            
                                Is it possible to store '' (empty string) as a non NULL value in the database?
                            
                                How to SELECT using both wildcards (LIKE) and array (IN)?
                            
                                Determine MAX Decimal Scale Used on a Column
                            
                                How to calculate the slope in SQL
                            
                                INSTEAD OF TRIGGER, Would it infinitely loop?
                            
                                How to have GROUP BY and COUNT include zero sums?
                            
                                REPLACE INTO, does it re-use the PRIMARY KEY?
                            
                                SELECT DISTINCT not working
                            
                                MySQL many-to-many relationship with FOREIGN KEYS
                            
                                Why hibernate always call "update" statement after using "select" statement in MySQL?
                            
                                Can we write case statement without having else statement
                            
                                How to query for empty array in JSONB?
                            
                                How to optimize Core Data query for full text search
                            
                                SQL - How to ALTER COLUMN on a computed column
                            
                                How to find Expr#### in Execution Plan
                            
                                Microsoft SQL Compact Edition rename column

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Calculate number of concurrent events in SQL

Tags:

performance

sql

timestamp

postgresql

query-optimization

Sologoub

People also ask

1 Answers

Mike Sherrill 'Cat Recall'

Recent Activity

Donate For Us