
How to effectively do database as-of queries?

Excuse the long question!

We have two database tables, e.g. Car and Wheel. They are related in that a wheel belongs to a car and a car has multiple wheels. The wheels, however, can be changed without affecting the "version" of the car. The car's record can be updated (e.g. paint job) without affecting the version of the wheels (i.e. no cascade updating).

For example, Car table currently looks like this:

CarId, CarVer, VersionTime, Colour
   1      1       9:00       Red
   1      2       9:30       Blue
   1      3       9:45       Yellow
   1      4      10:00       Black

The Wheels table looks like this (this car only has two wheels!)

WheelId, WheelVer, VersionTime, CarId
   1         1           9:00     1
   1         2           9:40     1
   1         3          10:05     1
   2         1           9:00     1

So, there have been 4 versions of this two-wheeled car. Its second wheel (WheelId 2) hasn't changed. The first wheel (WheelId 1) was changed (e.g. painted) at 9:40 and again at 10:05.

How do I efficiently do as-of queries that can be joined to other tables as required? Note that this is a new database: we own the schema and can change it or add audit tables to make this query easier. We've tried one audit table approach (with columns: CarId, CarVersion, WheelId, WheelVersion, CarVerTime, WheelVerTime), but it didn't really improve our query.

Example query: Show the Car ID 1 as it was, including its wheel records as of 9:50. This query should result in these two rows being returned:

WheelId, WheelVer, WheelVerTime, CarId, CarVer, CarVerTime, CarColour
   1         2         9:40        1       3       9:45      Yellow
   2         1         9:00        1       3       9:45      Yellow

The best query we could come up with was this:

select c.CarId, c.VersionTime, w.WheelId,w.WheelVer,w.VersionTime,w.CarId
from Cars c, 
(    select w.WheelId,w.WheelVer,w.VersionTime,w.CarId
    from Wheels w
    where w.VersionTime <= "12 Jun 2009 09:50" 
     group by w.WheelId,w.CarId
     having w.WheelVer = max(w.WheelVer)
) w
where c.CarId = w.CarId
and c.CarId = 1
and c.VersionTime <= "12 Jun 2009 09:50" 
group by c.CarId, w.WheelId,w.WheelVer,w.VersionTime,w.CarId
having c.CarVer = max(c.CarVer)

And, if you wanted to try this then the create table and insert record SQL is here:

create table Wheels
(
WheelId int not null,
WheelVer int not null,
VersionTime datetime not null,
CarId int not null,
 PRIMARY KEY  (WheelId,WheelVer)
)
go

insert into Wheels values (1,1,'12 Jun 2009 09:00', 1)
go
insert into Wheels values (1,2,'12 Jun 2009 09:40', 1)
go
insert into Wheels values (1,3,'12 Jun 2009 10:05', 1)
go
insert into Wheels values (2,1,'12 Jun 2009 09:00', 1)
go


create table Cars
(
CarId int not null,
CarVer int not null,
VersionTime datetime not null,
colour varchar(50) not null,
 PRIMARY KEY  (CarId,CarVer)
)
go

insert into Cars values (1,1,'12 Jun 2009 09:00', 'Red')
go
insert into Cars values (1,2,'12 Jun 2009 09:30',  'Blue')
go
insert into Cars values (1,3,'12 Jun 2009 09:45',  'Yellow')
go
insert into Cars values (1,4,'12 Jun 2009 10:00',  'Black')
go
ng5000 asked Feb 28 '23 15:02


2 Answers

This kind of table is known as a valid-time state table in the literature. It is universally accepted that each row should model a period by having a start date and an end date. Basically, the unit of work in SQL is the row, and a row should completely define the entity; by having just one date per row, not only do your queries become more complex, your design is compromised by splitting subatomic parts onto different rows.
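
As a rough sketch of that row-per-period shape, here is the question's car data with both dates on each row. The table name CarPeriods and the columns StartTime and EndTime are my own assumptions for illustration; the period is treated as closed-open, [StartTime, EndTime), and a NULL EndTime is assumed to mean "still current".

```sql
-- Hypothetical table: one row per car version, valid over [StartTime, EndTime).
create table CarPeriods
(
CarId int not null,
CarVer int not null,
StartTime datetime not null,
EndTime datetime null,          -- NULL = current version (assumed convention)
Colour varchar(50) not null,
 PRIMARY KEY  (CarId,CarVer)
)
go

insert into CarPeriods values (1,1,'12 Jun 2009 09:00','12 Jun 2009 09:30','Red')
insert into CarPeriods values (1,2,'12 Jun 2009 09:30','12 Jun 2009 09:45','Blue')
insert into CarPeriods values (1,3,'12 Jun 2009 09:45','12 Jun 2009 10:00','Yellow')
insert into CarPeriods values (1,4,'12 Jun 2009 10:00',NULL,'Black')
go

-- As-of lookup: the row whose period contains 09:50.
select CarId, CarVer, StartTime, Colour
from CarPeriods
where CarId = 1
and StartTime <= '12 Jun 2009 09:50'
and ('12 Jun 2009 09:50' < EndTime or EndTime is null)
```

With this data the lookup returns car version 3 (Yellow), with no grouping or max() needed.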

As mentioned by Erwin Smout, one of the definitive books on the subject is:

Richard T. Snodgrass (1999). Developing Time-Oriented Database Applications in SQL

It's out of print but happily is available as a free download PDF (link above).

I have actually read it and have implemented many of the concepts. Much of the text is in ISO/ANSI Standard SQL-92 and although some have been implemented in proprietary SQL syntaxes, including SQL Server (also available as downloads) I found the conceptual information much more useful.

Joe Celko also has a book, 'Thinking in Sets: Auxiliary, Temporal, and Virtual Tables in SQL', largely derived from Snodgrass's work, though I have to say where the two diverge I find Snodgrass's approaches preferable.

I concur this stuff is hard to implement in the SQL products we currently have. We think long and hard before making data temporal; if we can get away with merely 'historical' then we will. Much of the temporal functionality in SQL-92 is missing from SQL Server e.g. INTERVAL, OVERLAPS, etc. Some things as fundamental as sequenced 'primary keys' to ensure periods do not overlap cannot be implemented using CHECK constraints in SQL Server, necessitating triggers and/or UDFs.
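
To make the overlap problem concrete, here is a sketch of the kind of check a trigger would have to run. It is only the detection query, not a full trigger, and the WheelPeriods table with its StartTime and EndTime columns is a hypothetical period version of the question's Wheels table, not something from the original schema.

```sql
-- Hypothetical period table for wheels; EndTime NULL = still current.
create table WheelPeriods
(
WheelId int not null,
WheelVer int not null,
StartTime datetime not null,
EndTime datetime null,
CarId int not null,
 PRIMARY KEY  (WheelId,WheelVer)
)
go

insert into WheelPeriods values (1,1,'12 Jun 2009 09:00','12 Jun 2009 09:40',1)
-- Deliberately bad row: starts at 09:35, before version 1 ends at 09:40.
insert into WheelPeriods values (1,2,'12 Jun 2009 09:35',NULL,1)
go

-- Pairs of versions of the same wheel whose [StartTime, EndTime) periods overlap.
-- An open-ended period (NULL EndTime) is treated as running to the end of 9999.
select a.WheelId, a.WheelVer as FirstVer, b.WheelVer as SecondVer
from WheelPeriods a
join WheelPeriods b
    on b.WheelId = a.WheelId
    and a.WheelVer < b.WheelVer
where a.StartTime < coalesce(b.EndTime, '31 Dec 9999')
and b.StartTime < coalesce(a.EndTime, '31 Dec 9999')
```

The ordinary primary key on (WheelId, WheelVer) accepts both rows, yet the query reports versions 1 and 2 of wheel 1 as overlapping. That gap is what a sequenced key would close.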

Snodgrass's book is based on his work for SQL3, a proposed extension to Standard SQL to provide much better support for temporal databases, though sadly this seems to have been effectively shelved years ago :(

onedaywhen answered Mar 08 '23 08:03


As-of queries are easier when each row has a start and an end time. Storing the end time in the table would be most efficient, but if this is hard, you can query it like:

select 
    ThisCar.CarId
,   StartTime = ThisCar.VersionTime
,   EndTime = NextCar.VersionTime
from Cars ThisCar
left join Cars NextCar
    on NextCar.CarId = ThisCar.CarId
    and ThisCar.VersionTime < NextCar.VersionTime
left join Cars BetweenCar
    on BetweenCar.CarId = ThisCar.CarId
    and ThisCar.VersionTime < BetweenCar.VersionTime
    and BetweenCar.VersionTime < NextCar.VersionTime
where BetweenCar.CarId is null
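
If you are on SQL Server 2012 or later (an assumption about your version; lead() is not available before that), the same start and end pairs can probably be produced with a window function instead of the two self-joins. A sketch against the question's Cars table:

```sql
select
    CarId
,   CarVer
,   Colour
,   StartTime = VersionTime
    -- VersionTime of the next version of the same car; NULL for the latest version.
,   EndTime = lead(VersionTime) over (partition by CarId order by VersionTime)
from Cars
```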

You can store this in a view. Say the view is called vwCars, you can select a car for a particular date like:

select * 
from vwCars
where StartTime <= '2009-06-12 09:15' 
and ('2009-06-12 09:15' < EndTime or EndTime is null)

You could store this in a table-valued function, but that might have a steep performance penalty.
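
Putting it together for the example in the question (car 1 and its wheels as of 9:50), a sketch could look like the following. The view names vwCarPeriods and vwWheelPeriods are made up for this example, and the views carry a few more columns than the one above so the result matches the requested output. The end time here comes from a correlated subquery, which is just one way to derive it.

```sql
-- EndTime = earliest later VersionTime for the same car, NULL for the latest version.
create view vwCarPeriods as
select
    ThisCar.CarId
,   ThisCar.CarVer
,   ThisCar.colour
,   StartTime = ThisCar.VersionTime
,   EndTime = (select min(NextCar.VersionTime)
               from Cars NextCar
               where NextCar.CarId = ThisCar.CarId
               and NextCar.VersionTime > ThisCar.VersionTime)
from Cars ThisCar
go

-- Same idea for wheels, keyed on WheelId.
create view vwWheelPeriods as
select
    ThisWheel.WheelId
,   ThisWheel.WheelVer
,   ThisWheel.CarId
,   StartTime = ThisWheel.VersionTime
,   EndTime = (select min(NextWheel.VersionTime)
               from Wheels NextWheel
               where NextWheel.WheelId = ThisWheel.WheelId
               and NextWheel.VersionTime > ThisWheel.VersionTime)
from Wheels ThisWheel
go

-- Car 1 and its wheels as they were at 09:50.
select
    w.WheelId
,   w.WheelVer
,   WheelVerTime = w.StartTime
,   c.CarId
,   c.CarVer
,   CarVerTime = c.StartTime
,   CarColour = c.colour
from vwCarPeriods c
join vwWheelPeriods w
    on w.CarId = c.CarId
where c.CarId = 1
and c.StartTime <= '12 Jun 2009 09:50'
and ('12 Jun 2009 09:50' < c.EndTime or c.EndTime is null)
and w.StartTime <= '12 Jun 2009 09:50'
and ('12 Jun 2009 09:50' < w.EndTime or w.EndTime is null)
```

Against the sample data this gives the two rows asked for: wheel 1 version 2 (9:40) and wheel 2 version 1 (9:00), both with car version 3 (9:45, Yellow).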

Andomar answered Mar 08 '23 08:03