Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Data structure for non-overlapping ranges within a single dimension

I need a data structure that can store non-overlapping ranges within a single dimension. The entire range of the dimension need not be completely covered.

An example would be a conference room scheduler. The dimension is time. No two schedules may overlap. The conference room isn't always scheduled. In other words, for a given time there can be at most one schedule.

A quick solution is for a range to store the start and end times.

Range {
    Date start
    Date end
}

This is non-normalized and requires the container to enforce no overlapping. For two adjacent ranges, the previous' end will be redundant with the next's start.

Another scheme might involve storing one boundary value with each range. But for a contiguous sequence of ranges, there will always be one more boundary values than ranges. To get around this the sequence could be represented as alternating boundary values and ranges:

B = boundary value, r = range

B-r-B-r-B

The data structure might look like:

Boundary {
    Date value
    Range prev
    Range next
}

Range {
    Boundary start
    Boundary end
}

In essence it's a doubly linked list with alternating types.

Ultimately, whatever data structure I use will be represented in both memory (application code) and a relational database.

I'm curious what academic or industry tried solutions exists.

like image 928
Steve Kuo Avatar asked Oct 17 '08 00:10

Steve Kuo


2 Answers

The normalized way to represent your data would be to store a record for each unit of time. This can be done in the example of the conference scheduling application. Your constraint would be a unique constraint for

(RoomId, StartTime)

In the case of continuous ranges, you necessarily need to store 2 things, one boundary and either the second boundary or the length. It is usually done by storing the second boundary and then creating a constraint on both boundary of the kind

(boundary not between colBoudaryA and colBoundaryB)

with the additional constraint that

(startBoundary < endBoundary)
like image 156
Sklivvz Avatar answered Sep 21 '22 06:09

Sklivvz


A doubly linked list works well because you only use as much memory as you have filled ranges, and you need only check overlapping on insertions - it's almost trivial to do so at that point. If there's overlap the new item is rejected.

RoomID
ReservationID
PreviousReservationID
NextReservationID
StartTimeDate
EndTimeDate
Priority
UserID

The priority and UserID allow for schedules to have priorities (professor might have more clout than a student group) so that a new item can 'knock' the lower priority items out of the way during an insertion, and the UserID allows an email to be sent to the bumped meeting organizers.

You'll want to consider adding a table that points to the first meeting of each day so that searches can be optimized.

-Adam

like image 44
Adam Davis Avatar answered Sep 22 '22 06:09

Adam Davis