Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to perform a time-sequenced (temporal) LEFT OUTER JOIN

TL;DR In the book Developing Time-Oriented Database Applications in SQL by Richard T. Snodgrass, section 6.3.1 he demonstrates how to perform a sequenced inner join on a transaction-time state table, but does not discuss how to perform a sequenced left outer join.

Initial setup

Suppose the following events occur:

  • 2025-01-01 - User 0 (John) is created. Order 0 (for User 0) is created. Order 1 (for User 1) is created.
  • 2025-01-02 - User 1 (Mary) is created.
  • 2025-01-03 - User 0 (Jon) is renamed.
  • 2025-01-04 - Order 0 is moved from User 0 (Jon) to User 1 (Mary).

The resulting transaction-time state tables are these:

Note: I have deliberately combined the history and current table, unlike the official SQL server temporal tables feature which has them separated. This is similar to querying temporal tables with FOR SYSTEM_TIME ALL when using the official temporal table feature in SQL Server 2016

Orders

Id UserId ValidFrom ValidTo
0 0 2025-01-01 2025-01-04
0 1 2025-01-04 9999-12-31
1 1 2025-01-01 9999-12-31

Users

Id Name ValidFrom ValidTo
0 John 2025-01-01 2025-01-03
0 Jon 2025-01-03 9999-12-31
1 Mary 2025-01-02 9999-12-31

DDL/DML for Copy Paste

CREATE TABLE Orders (
    Id INT,
    UserId INT ,
    ValidFrom DATE,
    ValidTo DATE,
);

INSERT INTO Orders
VALUES (0, 0, '2025-01-01', '2025-01-04'),
       (0, 1, '2025-01-04', '9999-12-31'),
       (1, 1, '2025-01-01', '9999-12-31');

CREATE TABLE Users (
    Id INT,
    Name VARCHAR(50),
    ValidFrom DATE,
    ValidTo DATE
);

INSERT INTO Users
VALUES (0, 'John', '2025-01-01', '2025-01-03'),
       (0, 'Jon', '2025-01-03', '9999-12-31'),
       (1, 'Mary', '2025-01-02', '9999-12-31');

Now, I can use the following query to get all the current orders, and the associated user name:

SELECT O.Id, U.Name 
FROM Orders O
LEFT JOIN Users U ON U.Id = O.UserId
WHERE O.ValidTo = '9999-12-31' 
  AND U.ValidTo = '9999-12-31'

Results

O.Id U.Name
0 Mary
1 Mary

Question

I want to see how the results of the above query have changed over time. What query should I write to get the results shown below?

Desired Results

O.Id U.Name ValidFrom ValidTo Note (for reference, not actually part of the query)
0 John 2025-01-01 2025-01-03 User 0 + Order 0 are created
1 NULL 2025-01-01 2025-01-02 Order 1 is created
1 Mary 2025-01-02 9999-12-31 User 1 is created
0 Jon 2025-01-03 2025-01-04 User 0 is renamed
0 Mary 2025-01-04 9999-12-31 Order 0 is moved from User 0 to User 1
like image 452
Rubydesic Avatar asked Nov 01 '25 02:11

Rubydesic


1 Answers

As coded in your attempt, the LEFT JOIN is not working as intended, because once it matches a row (having an overlapping date range), it will not also supply a null match for the non-overlapping case.

One solution would be to use a CTE (common table expression) to generate additional User rows for the since beginning-of-time cases and UNION them with the actual source rows. If you ever have user IDs having no rows extending to the end-of-time, you might also need to generate rows for those cases.

A standard test for overlapping date ranges (having exclusive end dates) is Start1 < End2 AND Start2 < End1. This can be included in your join condition. The resulting overlap date range can then be calculated as GREATEST(start1, start2) and LEAST(end1, end2).

The following query should suit your needs:

WITH ExtendedUsers AS (
    SELECT Id, Name, ValidFrom, ValidTo
    FROM Users
    UNION
    -- Beginning of time extensions
    SELECT Id, NULL, '1900-01-01' AS ValidFrom, MIN(ValidFrom) AS ValidTo
    FROM Users
    GROUP BY Id
)
SELECT
    O.Id, U.Name,
    GREATEST(O.ValidFrom, U.ValidFrom) ValidFrom,
    LEAST(O.ValidTo, U.ValidTo) ValidTo
FROM Orders O
LEFT JOIN ExtendedUsers U
    ON U.Id = O.UserId
    AND U.ValidFrom < O.ValidTo
    AND O.ValidFrom < U.ValidTo
ORDER BY ValidFrom, O.Id;

The LEFT JOIN will still handle the case where there are no matching User rows. If this situation never occurs, the query can be changed to use an inner join.

Results:

Id Name ValidFrom ValidTo
0 John 2025-01-01 2025-01-03
1 null 2025-01-01 2025-01-02
1 Mary 2025-01-02 9999-12-31
0 Jon 2025-01-03 2025-01-04
0 Mary 2025-01-04 9999-12-31

See this db<>query for a demo.

If you need to generate activity notes (and they weren't just included in your post for reference), you can:

  1. Use the OUTER APPLY(SELECT TOP 1 ...) pattern to select the prior active Order and User rows,
  2. Generate notes based on various tests,
  3. Combine those notes into a single value using STRING_AGG(), and
  4. Add the combined notes to your final select list.
WITH ...
SELECT ..., N.Notes
FROM ...
OUTER APPLY (
    SELECT TOP 1 O2.*
    FROM Orders O2
    WHERE O2.Id = O.Id
    AND O2.ValidFrom < GREATEST(O.ValidFrom, U.ValidFrom)
    ORDER BY O2.ValidFrom DESC
) OPrior
OUTER APPLY (
    SELECT TOP 1 U2.*
    FROM Users U2
    WHERE U2.Id = U.Id
    AND U2.ValidFrom < GREATEST(O.ValidFrom, U.ValidFrom)
    ORDER BY U2.ValidFrom DESC
) UPrior
CROSS APPLY (
    SELECT STRING_AGG(N1.Note, ', ') WITHIN GROUP(ORDER BY Seq) AS Notes
    FROM (
        SELECT 'Order Created' AS Note, 1 AS Seq
        WHERE OPrior.Id IS NULL
        UNION ALL
        SELECT 'User Reassigned' AS Note, 2 AS Seq
        WHERE OPrior.UserId <> O.UserId
        UNION ALL
        SELECT 'User Created' AS Note, 3 AS Seq
        WHERE UPrior.Id IS NULL
        AND U.ValidFrom > '1900-01-01'
        UNION ALL
        SELECT 'User Renamed' AS Note, 4 AS Seq
        WHERE UPrior.Id = U.Id
        AND UPrior.Name <> U.Name
    ) N1
) N
...;

Results:

Id Name ValidFrom ValidTo Notes
0 John 2025-01-01 2025-01-03 Order Created, User Created
1 null 2025-01-01 2025-01-02 Order Created
1 Mary 2025-01-02 9999-12-31 User Created
0 Jon 2025-01-03 2025-01-04 User Renamed
0 Mary 2025-01-04 9999-12-31 User Reassigned

See this db<>fiddle.

I'll leave the specifics of formatting the notes with additional details to the OP.

For the above to work efficiently, I would recommend indexes on the following:

  • Orders(Id, ValidFrom)
  • Users(Id, ValidFrom)

Side note: I would question how a database with any kind of referential integrity could create an Orders row containing a UserId value, where the initial Orders.ValidFrom pre-dates the initial Users.ValidFrom. If this situation did not exists, the need for the LEFT JOIN and dummy since beginning-of-time rows would not be an issue.

like image 166
T N Avatar answered Nov 02 '25 15:11

T N