I would like to implement a dynamic multiple timeline queue. The context here is scheduling in general.
A single timeline queue is still simple: it is a timeline of tasks, where each task has a start and an end time. Tasks are grouped into jobs. The tasks of a job must preserve their order, but the job can be moved around in time as a whole. For example, it could be represented as:
    --t1--    ---t2.1-----------t2.2-------
    '    '    '        '                  '
    20   30   40       70                120
I would implement this as a heap queue with some additional constraints. The Python sched module has some basic approaches in this direction.
One queue stands for a resource and a resource is needed by a task. Graphical example:
R1 --t1.1----- --t2.2----- -----t1.3--
/ \ /
R2 --t2.1-- ------t1.2-----
It becomes interesting when a task can use one of multiple resources. An additional constraint is that consecutive tasks, which can run on the same resource, must use the same resource.
Example: If (from above) task t1.3 can run on R1 or R2, the queue should look like:
R1 --t1.1----- --t2.2-----
/ \
R2 --t2.1-- ------t1.2----------t1.3--
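The "stay on the same resource" constraint illustrated above can be sketched in a few lines. This is only an illustration: the function name assign_resources, the representation of a job as a list of (task name, allowed resources) pairs, and the fallback of picking the first allowed resource are assumptions made for the example, not part of the question.

```python
def assign_resources(tasks):
    """Pick a resource for each task of one job, in order.

    tasks: list of (name, allowed) pairs, where allowed is a list of
    resource names the task can run on (hypothetical representation).
    """
    assignment = []
    previous = None
    for name, allowed in tasks:
        if previous in allowed:
            # Consecutive tasks that can share a resource must share it.
            resource = previous
        else:
            # Assumption: otherwise fall back to the first allowed resource.
            resource = allowed[0]
        assignment.append((name, resource))
        previous = resource
    return assignment


# Job 1 from the diagrams: t1.3 may run on R1 or R2 and follows t1.2 on R2.
job1 = [("t1.1", ["R1"]), ("t1.2", ["R2"]), ("t1.3", ["R1", "R2"])]
print(assign_resources(job1))
```

A real scheduler would choose the fallback resource by looking at free slots rather than by list position; the sketch only shows how the constraint narrows the choice.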
FirstFreeSlot: find the first slot, at or after start, where there is free time for duration (see detailed explanation at the end).
The point is: How can I represent this information to provide the functionality efficiently? Implementation is up to me ;-)
Update: A further point to consider: Typical interval structures focus on "What is at point X?" But in this case enqueue, and therefore the question "Where is the first empty slot for duration D?", is much more important. So a segment/interval tree or something else in this direction is probably not the right choice.
To elaborate on the free slots: because we have multiple resources and the constraint of grouped tasks, there can be free time slots on some resources. Simple example: t1.1 runs on R1 for 40 and then t1.2 runs on R2. So there is an empty interval of [0, 40] on R2 which can be filled by the next job.
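The free interval in this example can be computed mechanically from the busy intervals of each resource. The following is a minimal sketch; the function name free_intervals, the horizon parameter, and the duration of 30 for t1.2 are assumptions made for illustration.

```python
def free_intervals(busy, horizon):
    """Return the free (start, end) intervals on one resource.

    busy: list of (start, end) tuples occupied by tasks (may be unsorted).
    horizon: end of the planning window (an assumption for this sketch).
    """
    free = []
    cursor = 0
    for start, end in sorted(busy):
        if start > cursor:
            free.append((cursor, start))  # gap before this task
        cursor = max(cursor, end)
    if cursor < horizon:
        free.append((cursor, horizon))    # free time after the last task
    return free


# t1.1 runs on R1 during [0, 40]; t1.2 then runs on R2 (duration 30 is made up).
schedule = {"R1": [(0, 40)], "R2": [(40, 70)]}
for resource, busy in schedule.items():
    print(resource, free_intervals(busy, horizon=100))
```

For R2 the first free interval is (0, 40), which is the slot the next job could fill.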
Update 2: There is an interesting proposal in another SO question. If someone can port it to my problem and show that it works for this case (especially extended to multiple resources), this would probably be a valid answer.
Let's restrict ourselves to the simplest case first: Find a suitable data structure that allows for a fast implementation of FirstFreeSlot().
The free time slots live in a two-dimensional space: One dimension is the start time s, the other is the length d. FirstFreeSlot(D) effectively answers the following query:
min s: d >= D
If we think of s and d as a Cartesian space (d = x, s = y), this means finding the lowest point in a half-plane bounded by a vertical line. A quad-tree, perhaps with some auxiliary information in each node (namely, min s over all leaves), will help answer this query efficiently.
For Enqueue() in the face of resource constraints, consider maintaining a separate quad-tree for each resource. The quad-tree can also answer queries like
min s: s >= S & d >= D
(required for restricting the start time) in a similar fashion: now a rectangle (open toward the top right, given d = x and s = y) is cut off, and we look for min s in that rectangle.
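To pin down what the two queries return, here is a brute-force reference sketch. It is deliberately not a quad-tree: it scans a plain list of (s, d) pairs, which is useful mainly as a baseline to test a faster structure against. The function name first_free_slot and the parameter earliest (playing the role of S) are assumptions for the example.

```python
def first_free_slot(slots, duration, earliest=0):
    """Return the smallest feasible start time, or None if nothing fits.

    slots: iterable of (s, d) pairs, a free slot starting at s with length d.
    Answers "min s: s >= earliest and d >= duration" by a linear scan.
    """
    candidates = [s for s, d in slots if s >= earliest and d >= duration]
    return min(candidates) if candidates else None


free = [(0, 40), (70, 30), (120, 200)]
print(first_free_slot(free, 30))               # first slot of length >= 30
print(first_free_slot(free, 50))               # only the long slot fits
print(first_free_slot(free, 30, earliest=10))  # restricted start time
```

Note that the query as written skips a slot that starts before S even if enough of it remains after S; whether such slots should be clipped instead is a design decision left open here.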
Put() and Delete() are simple update operations for the quad-tree.
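Whatever structure holds the free slots, reserving time for a task removes one free slot and adds back up to two smaller ones. A minimal sketch of that update on a plain list of (s, d) pairs, standing in for the quad-tree; the function name occupy is hypothetical.

```python
def occupy(slots, start, duration):
    """Return a new free-slot list after reserving start .. start + duration.

    slots: list of (s, d) pairs. The reservation must lie inside one slot.
    """
    end = start + duration
    result = []
    placed = False
    for s, d in slots:
        if not placed and s <= start and end <= s + d:
            placed = True
            if start > s:          # free time left before the task
                result.append((s, start - s))
            if end < s + d:        # free time left after the task
                result.append((end, s + d - end))
        else:
            result.append((s, d))
    if not placed:
        raise ValueError("no free slot contains the requested interval")
    return result


print(occupy([(0, 40), (70, 30)], start=10, duration=20))
```

In tree terms this corresponds to one point deletion followed by at most two insertions; releasing a task is the reverse, plus merging with adjacent free slots.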
Recalculate() can be implemented by Delete() + Put(). To avoid unnecessary operations, define sufficient (or, ideally, sufficient and necessary) conditions for triggering a recalculation. The Observer pattern might help here, but remember to put the tasks for rescheduling into a FIFO queue or a priority queue sorted by start time. (You want to finish rescheduling the current task before moving on to the next.)
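The "priority queue sorted by start time" idea might look roughly like the sketch below, using the standard heapq module. The delete and put callables are hypothetical hooks standing in for the real operations, and the convention that put may return displaced tasks is an assumption for the example.

```python
import heapq


def recalculate(pending, delete, put):
    """Reschedule tasks in start-time order, one at a time.

    pending: list of (start_time, task_name) tuples that need rescheduling.
    delete, put: callables taking a task name (hypothetical hooks).
    put may return further (start_time, task_name) tuples that were
    displaced and now need rescheduling themselves.
    """
    heap = list(pending)
    heapq.heapify(heap)
    order = []
    while heap:
        _, task = heapq.heappop(heap)   # earliest start time first
        delete(task)
        for displaced in put(task) or ():
            heapq.heappush(heap, displaced)
        order.append(task)              # this task is finished before the next
    return order


log = []
order = recalculate(
    [(70, "t2.2"), (40, "t2.1")],
    delete=lambda t: log.append(("delete", t)),
    put=lambda t: log.append(("put", t)),
)
print(order)
```

Each task is fully deleted and re-put before the next one is popped, which is the ordering guarantee asked for above.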
On a more general note, I'm sure you are aware that most kinds of scheduling problems, especially those with resource constraints, are at least NP-complete. So don't expect an algorithm with a decent runtime in the general case.
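    class Task:
        def __init__(self, name='', duration=0, resources=None):
            self.name = name
            self.duration = duration
            # per-instance list; a class-level list() would be shared by all tasks
            self.resources = list(resources or [])


    class Job:
        def __init__(self, name='', tasks=None):
            self.name = name
            self.tasks = list(tasks or [])


    class Assignment:
        def __init__(self, task=None, resource=None, time=None):
            self.task = task
            self.resource = resource
            self.time = time


    class MultipleTimeline:
        def __init__(self):
            self.assignments = []

        def enqueue(self, job):
            pass

        def put(self, job):
            pass

        def delete(self, job):
            pass

        def recalculate(self):
            pass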
Is this a first step in the direction you are looking for, i.e. a data model written out in Python?
Update:
Here is my more efficient model:
It basically puts all Tasks in a linked list ordered by end time.
    class Task:
        def __init__(self, name='', duration=0, resources=0):
            self.name = name
            self.duration = duration    # the amount of work to be done
            self.resources = resources  # bitmap that tells what resources this task uses
            # the following attributes are only used when the task is scheduled
            self.next = None      # the next scheduled task by end time
            self.resource = None  # the resource this task is scheduled on
            self.gap = None       # the amount of time before the next scheduled task starts on this resource


    class Job:
        def __init__(self, id=0, tasks=None):
            self.id = id
            self.tasks = list(tasks or [])  # the Task instances of this job in order


    class Resource:
        def __init__(self, bitflag=0):
            self.bitflag = bitflag  # a bit flag which is combined bitwise with Task.resources
            self.firsttask = None   # the first Task instance that is scheduled on this resource
            self.gap = None         # the amount of time before the first Task starts


    class MultipleTimeline:
        def __init__(self):
            self.resources = []

        def FirstFreeSlot(self):
            pass

        def enqueue(self, job):
            pass

        def put(self, job):
            pass

        def delete(self, job):
            pass

        def recalculate(self):
            pass
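As a small illustration of how Task.resources and Resource.bitflag are meant to interact, the check "can this task run on that resource?" reduces to one bitwise AND. The helper name can_run_on and the concrete bit values are assumptions for this sketch.

```python
R1, R2, R3 = 0b001, 0b010, 0b100   # one bit per resource (example values)


def can_run_on(task_resources, resource_bitflag):
    """True if the task's bitmap includes the resource's bit."""
    return (task_resources & resource_bitflag) != 0


t1_3 = R1 | R2                 # t1.3 may run on R1 or R2
print(can_run_on(t1_3, R2))
print(can_run_on(t1_3, R3))
```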
Because of the updates by enqueue and put I decided not to use trees.
Because of put, which moves tasks in time, I decided not to use absolute times.
FirstFreeSlot not only returns the task with the free slot but also the other running tasks with their end times.
enqueue works as follows:
We look for a free slot by FirstFreeSlot and schedule the task here.
If there is enough space for the next task we can schedule it in too.
If not: look at the other running tasks to see whether they have free space.
If not: run FirstFreeSlot with parameters of this time and running tasks.
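To illustrate how a free slot can be found when only relative gaps are stored, here is a minimal single-resource sketch. It flattens the linked list into a Python list of (task duration, gap after) pairs, which is an assumption made to keep the example short; the function name first_gap_start is hypothetical as well.

```python
def first_gap_start(initial_gap, tasks, duration):
    """Return the absolute start of the first gap of at least `duration`.

    initial_gap: free time before the first task (Resource.gap in the model).
    tasks: list of (task_duration, gap_after) pairs in schedule order;
           gap_after is None only for the last task (open-ended free time).
    """
    clock = 0
    gap = initial_gap
    for task_duration, gap_after in tasks:
        if gap >= duration:
            return clock
        clock += gap + task_duration   # skip the gap and the task after it
        gap = gap_after
    # After the last task: None means the free time is unbounded.
    if gap is None or gap >= duration:
        return clock
    return None


# R2 from the question: free for 40, then t1.2 (duration 30 is made up).
print(first_gap_start(40, [(30, None)], 20))
print(first_gap_start(40, [(30, None)], 50))
```

Absolute times are only reconstructed during the walk, so moving a task with put changes a couple of gap values instead of every later start time.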
Improvements:
If put is not used very often and enqueue is done from time zero, we could keep track of the overlapping tasks by including a dict() per task that contains the other running tasks. Then it is also easy to keep a list() per Resource which contains the scheduled tasks with absolute time for this Resource, ordered by end time. Only those tasks are included that have bigger time gaps than before. Now we can find a free slot more easily.
Questions:
Do tasks scheduled by put need to be executed at that time?
If yes: What if another task to be scheduled by put overlaps?
Do all resources execute a task equally fast?