(self) join by time intervals

I have a table in an Oracle database. The schema is:

create table PERIODS
( 
  ID NUMBER, 
  STARTTIME TIMESTAMP, 
  ENDTIME TIMESTAMP, 
  TYPE VARCHAR2(100)
)

I have two different TYPEs: TYPEA and TYPEB. They have independent start and end times, and they can overlap. What I would like to find are the periods of TYPEB that started, are fully contained, or ended within a given period of TYPEA.

Here is what I came up with so far (with some sample data):

WITH mydata 
     AS (SELECT 100                                                    ID, 
                To_timestamp('2015-08-01 11:00', 'YYYY-MM-DD HH24:MI') STARTTIME, 
                To_timestamp('2015-08-01 11:20', 'YYYY-MM-DD HH24:MI') ENDTIME, 
                'TYPEA'                                                TYPE 
         FROM   dual 
         UNION ALL 
         SELECT 110                                                    ID, 
                To_timestamp('2015-08-01 11:30', 'YYYY-MM-DD HH24:MI') STARTTIME, 
                To_timestamp('2015-08-01 11:50', 'YYYY-MM-DD HH24:MI') ENDTIME, 
                'TYPEA'                                                TYPE 
         FROM   dual 
         UNION ALL 
         SELECT 120                                                    ID, 
                To_timestamp('2015-08-01 12:00', 'YYYY-MM-DD HH24:MI') STARTTIME, 
                To_timestamp('2015-08-01 12:20', 'YYYY-MM-DD HH24:MI') ENDTIME, 
                'TYPEA'                                                TYPE 
         FROM   dual 
         UNION ALL 
         SELECT 105                                                    ID, 
                To_timestamp('2015-08-01 10:55', 'YYYY-MM-DD HH24:MI') STARTTIME, 
                To_timestamp('2015-08-01 11:05', 'YYYY-MM-DD HH24:MI') ENDTIME, 
                'TYPEB'                                                TYPE 
         FROM   dual 
         UNION ALL 
         SELECT 108                                                    ID, 
                To_timestamp('2015-08-01 11:05', 'YYYY-MM-DD HH24:MI') STARTTIME, 
                To_timestamp('2015-08-01 11:15', 'YYYY-MM-DD HH24:MI') ENDTIME, 
                'TYPEB'                                                TYPE 
         FROM   dual 
         UNION ALL 
         SELECT 111                                                    ID, 
                To_timestamp('2015-08-01 11:15', 'YYYY-MM-DD HH24:MI') STARTTIME, 
                To_timestamp('2015-08-01 12:25', 'YYYY-MM-DD HH24:MI') ENDTIME, 
                'TYPEB'                                                TYPE 
         FROM   dual), 
     typeas 
     AS (SELECT starttime, 
                endtime 
         FROM   mydata 
         WHERE  TYPE = 'TYPEA'), 
     typebs 
     AS (SELECT id, 
                starttime, 
                endtime 
         FROM   mydata 
         WHERE  TYPE = 'TYPEB') 
SELECT id 
FROM   typebs b 
       join typeas a 
         ON ( b.starttime BETWEEN a.starttime AND a.endtime ) 
             OR ( b.starttime BETWEEN a.starttime AND a.endtime 
                  AND b.endtime BETWEEN a.starttime AND a.endtime ) 
             OR ( b.endtime BETWEEN a.starttime AND a.endtime ) 
ORDER  BY id;

This seems to work in principle. The result of the query above is:

        ID
----------
       105
       108
       111

so it selects the three TYPEB periods that started or ended inside the first TYPEA period.

The problem is that the table has about 200k entries, and already at this size the above query is quite slow. This is very surprising to me, as the number of both TYPEA and TYPEB entries is quite low (1-2k).

Is there a more efficient way to perform this type of self join? Did I miss something else in my query?

asked Aug 01 '15 20:08

Erik

1 Answer

Maybe worth a try. (In my experience it also helps in Oracle to write the most restrictive conditions last. Don't ask me why, and don't just believe me; better to run your own performance tests.)

SELECT
   p.id
FROM
   periods p
WHERE
   EXISTS(SELECT * FROM periods q WHERE
      (p.startTime BETWEEN q.startTime AND q.endTime             -- TYPEB period starts inside the TYPEA period
      OR p.endTime BETWEEN q.startTime AND q.endTime             -- TYPEB period ends inside the TYPEA period
      OR (p.startTime < q.startTime AND p.endTime > q.endTime)   -- overlapping correction (TYPEB encloses TYPEA), remove if not needed
      ) AND q.type = 'TYPEA'
   ) AND p.type = 'TYPEB'
ORDER BY
   p.id
;
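
As a possible simplification of the query above: if every row satisfies STARTTIME <= ENDTIME (an assumption, the question does not state it), the three OR branches, including the overlapping correction, reduce to the usual two-comparison overlap test. The sketch below uses that form. The index and its name periods_type_time_ix are hypothetical and are not part of the question's schema; whether the index helps depends on the data, so treat it as something to try rather than a guaranteed fix.

```sql
-- Hypothetical supporting index (name and column order are assumptions).
CREATE INDEX periods_type_time_ix ON periods (type, starttime, endtime);

-- Two closed intervals overlap exactly when each one starts
-- no later than the other one ends (assumes starttime <= endtime).
SELECT p.id
FROM   periods p
WHERE  p.type = 'TYPEB'
AND    EXISTS (SELECT 1
               FROM   periods q
               WHERE  q.type = 'TYPEA'
               AND    p.starttime <= q.endtime    -- TYPEB starts before TYPEA ends
               AND    p.endtime   >= q.starttime) -- TYPEB ends after TYPEA starts
ORDER  BY p.id;
```

Note that this form also returns TYPEB periods that completely enclose a TYPEA period, just like the query above with the overlapping correction left in. To compare the variants on your own data, one option is to look at the execution plan with Oracle's EXPLAIN PLAN and DBMS_XPLAN.DISPLAY:

```sql
EXPLAIN PLAN FOR
SELECT p.id
FROM   periods p
WHERE  p.type = 'TYPEB'
AND    EXISTS (SELECT 1
               FROM   periods q
               WHERE  q.type = 'TYPEA'
               AND    p.starttime <= q.endtime
               AND    p.endtime   >= q.starttime);

-- Show the plan that was just stored in the plan table.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```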

answered Sep 30 '22 17:09

maraca

Tags: performance, sql, oracle, query-optimization, self-join
