TL;DR In the book Developing Time-Oriented Database Applications in SQL by Richard T. Snodgrass, section 6.3.1 he demonstrates how to perform a sequenced inner join on a transaction-time state table, but does not discuss how to perform a sequenced left outer join.
Suppose the following events occur:
The resulting transaction-time state tables are these:
Note: I have deliberately combined the history and current table, unlike the official SQL server temporal tables feature which has them separated. This is similar to querying temporal tables with FOR SYSTEM_TIME ALL when using the official temporal table feature in SQL Server 2016
Orders
| Id | UserId | ValidFrom | ValidTo |
|---|---|---|---|
| 0 | 0 | 2025-01-01 | 2025-01-04 |
| 0 | 1 | 2025-01-04 | 9999-12-31 |
| 1 | 1 | 2025-01-01 | 9999-12-31 |
Users
| Id | Name | ValidFrom | ValidTo |
|---|---|---|---|
| 0 | John | 2025-01-01 | 2025-01-03 |
| 0 | Jon | 2025-01-03 | 9999-12-31 |
| 1 | Mary | 2025-01-02 | 9999-12-31 |
DDL/DML for Copy Paste
CREATE TABLE Orders (
Id INT,
UserId INT ,
ValidFrom DATE,
ValidTo DATE,
);
INSERT INTO Orders
VALUES (0, 0, '2025-01-01', '2025-01-04'),
(0, 1, '2025-01-04', '9999-12-31'),
(1, 1, '2025-01-01', '9999-12-31');
CREATE TABLE Users (
Id INT,
Name VARCHAR(50),
ValidFrom DATE,
ValidTo DATE
);
INSERT INTO Users
VALUES (0, 'John', '2025-01-01', '2025-01-03'),
(0, 'Jon', '2025-01-03', '9999-12-31'),
(1, 'Mary', '2025-01-02', '9999-12-31');
Now, I can use the following query to get all the current orders, and the associated user name:
SELECT O.Id, U.Name
FROM Orders O
LEFT JOIN Users U ON U.Id = O.UserId
WHERE O.ValidTo = '9999-12-31'
AND U.ValidTo = '9999-12-31'
Results
| O.Id | U.Name |
|---|---|
| 0 | Mary |
| 1 | Mary |
I want to see how the results of the above query have changed over time. What query should I write to get the results shown below?
| O.Id | U.Name | ValidFrom | ValidTo | Note (for reference, not actually part of the query) |
|---|---|---|---|---|
| 0 | John | 2025-01-01 | 2025-01-03 | User 0 + Order 0 are created |
| 1 | NULL | 2025-01-01 | 2025-01-02 | Order 1 is created |
| 1 | Mary | 2025-01-02 | 9999-12-31 | User 1 is created |
| 0 | Jon | 2025-01-03 | 2025-01-04 | User 0 is renamed |
| 0 | Mary | 2025-01-04 | 9999-12-31 | Order 0 is moved from User 0 to User 1 |
As coded in your attempt, the LEFT JOIN is not working as intended, because once it matches a row (having an overlapping date range), it will not also supply a null match for the non-overlapping case.
One solution would be to use a CTE (common table expression) to generate additional User rows for the since beginning-of-time cases and UNION them with the actual source rows. If you ever have user IDs having no rows extending to the end-of-time, you might also need to generate rows for those cases.
A standard test for overlapping date ranges (having exclusive end dates) is Start1 < End2 AND Start2 < End1. This can be included in your join condition. The resulting overlap date range can then be calculated as GREATEST(start1, start2) and LEAST(end1, end2).
The following query should suit your needs:
WITH ExtendedUsers AS (
SELECT Id, Name, ValidFrom, ValidTo
FROM Users
UNION
-- Beginning of time extensions
SELECT Id, NULL, '1900-01-01' AS ValidFrom, MIN(ValidFrom) AS ValidTo
FROM Users
GROUP BY Id
)
SELECT
O.Id, U.Name,
GREATEST(O.ValidFrom, U.ValidFrom) ValidFrom,
LEAST(O.ValidTo, U.ValidTo) ValidTo
FROM Orders O
LEFT JOIN ExtendedUsers U
ON U.Id = O.UserId
AND U.ValidFrom < O.ValidTo
AND O.ValidFrom < U.ValidTo
ORDER BY ValidFrom, O.Id;
The LEFT JOIN will still handle the case where there are no matching User rows. If this situation never occurs, the query can be changed to use an inner join.
Results:
| Id | Name | ValidFrom | ValidTo |
|---|---|---|---|
| 0 | John | 2025-01-01 | 2025-01-03 |
| 1 | null | 2025-01-01 | 2025-01-02 |
| 1 | Mary | 2025-01-02 | 9999-12-31 |
| 0 | Jon | 2025-01-03 | 2025-01-04 |
| 0 | Mary | 2025-01-04 | 9999-12-31 |
See this db<>query for a demo.
If you need to generate activity notes (and they weren't just included in your post for reference), you can:
OUTER APPLY(SELECT TOP 1 ...) pattern to select the prior active Order and User rows,STRING_AGG(), andWITH ...
SELECT ..., N.Notes
FROM ...
OUTER APPLY (
SELECT TOP 1 O2.*
FROM Orders O2
WHERE O2.Id = O.Id
AND O2.ValidFrom < GREATEST(O.ValidFrom, U.ValidFrom)
ORDER BY O2.ValidFrom DESC
) OPrior
OUTER APPLY (
SELECT TOP 1 U2.*
FROM Users U2
WHERE U2.Id = U.Id
AND U2.ValidFrom < GREATEST(O.ValidFrom, U.ValidFrom)
ORDER BY U2.ValidFrom DESC
) UPrior
CROSS APPLY (
SELECT STRING_AGG(N1.Note, ', ') WITHIN GROUP(ORDER BY Seq) AS Notes
FROM (
SELECT 'Order Created' AS Note, 1 AS Seq
WHERE OPrior.Id IS NULL
UNION ALL
SELECT 'User Reassigned' AS Note, 2 AS Seq
WHERE OPrior.UserId <> O.UserId
UNION ALL
SELECT 'User Created' AS Note, 3 AS Seq
WHERE UPrior.Id IS NULL
AND U.ValidFrom > '1900-01-01'
UNION ALL
SELECT 'User Renamed' AS Note, 4 AS Seq
WHERE UPrior.Id = U.Id
AND UPrior.Name <> U.Name
) N1
) N
...;
Results:
| Id | Name | ValidFrom | ValidTo | Notes |
|---|---|---|---|---|
| 0 | John | 2025-01-01 | 2025-01-03 | Order Created, User Created |
| 1 | null | 2025-01-01 | 2025-01-02 | Order Created |
| 1 | Mary | 2025-01-02 | 9999-12-31 | User Created |
| 0 | Jon | 2025-01-03 | 2025-01-04 | User Renamed |
| 0 | Mary | 2025-01-04 | 9999-12-31 | User Reassigned |
See this db<>fiddle.
I'll leave the specifics of formatting the notes with additional details to the OP.
For the above to work efficiently, I would recommend indexes on the following:
Orders(Id, ValidFrom)Users(Id, ValidFrom)Side note: I would question how a database with any kind of referential integrity could create an Orders row containing a UserId value, where the initial Orders.ValidFrom pre-dates the initial Users.ValidFrom. If this situation did not exists, the need for the LEFT JOIN and dummy since beginning-of-time rows would not be an issue.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With