I'm building Django app intended to maintain timetable(schedule) for 10K+ persons(and more in the future). Basically, problem statement is following: each person has separate timetable with free slots for the next year. It is discrete with step of 15 minutes. I need to develop models architecture(which will imply database design underneath) to do the following:
For example, I have John, who is free from 8AM-14PM 14 November and Sara who is free from 10AM to 11AM 14 November. If I query John's free time slots I want to get "8AM-14PM 14 November". If I query "free persons from 8AM to 11AM" I get John as Sara is not free until 10AM. If I query "free persons from 10AM to 11AM" I want to get both John and Sara. I thought about this problem and my ideas are below.
Solution №1: We create a model FreeTimeSlot that would store information about every 15 minute time span interval and build throughout relation to it from person.
class Person(models.Model):
name = models.CharField(max_length=32, null=False, blank=False)
free_slots = models.ManyToManyField(FreeTimeSlot, related_name='tutor_set', null=True, blank=True, through='PersonSlot')
class TimeSlot(models.Model):
time = models.DateTimeField(db_index=True) #perhaps other field type
class PersonSlot(models.Model):
person = models.ForeignKey(Person)
timeslot = models.ForeignKey(Slot)
class Meta:
db_table = 'person_free_slots'
unique_together = (('timeslot', 'person'))
We create 365*24*4 TimeSlot models for every 15 minute interval in the upcoming year, and if person indicates in his schedule free time we add relation to that TimeSlot. With this kind of architecture getting free timeslots for person is as simple as going through manager: person.free_time_slots Getting all persons free at a certain time (e.g. 10-10:45) is also quite easy, smoothing like:
timeslots = TimeSlot.objects.filter(time__in=['10:00', '10:15', '10:30'])
PersonSlot.objects.filter(timeslot__in=timeslots).values('person')
Solution №2: We avoid creating model for every timeslot but keep date in PersonTime model itself:
class Person(models.Model):
name = models.CharField(max_length=32, null=False, blank=False)
class TimeSlot(models.Model):
person = models.ForeignKey(Person, related_name='slots')
time_start = models.DateTimeField(db_index=True)
time_end = models.DateTimeField(db_index=True)
Getting list of free time slots is also easy(person.slots). Getting all persons free at a certain time (e.g. 10-10:45) would be like:
TimeSlot.objects.filter(time_start__gte="10:00", time_end__lte="10:45").values('person')
This solution will not work with intersecting intervals and I'm not sure if querying on indexed time for interval(using gte and lte comparisons on the same field) will work, and will work fast. I use Postgres if it can matter. I also wrote time queries in pseudocode to make code simpler.
So my question is the following, how good django developers would implement this functionality to provide speed for both queries on big data? I would appreciate an advice on possible caveats/benefits to my current solutions or new ideas.
Let's split this question into 2 parts.
Consider encoding the data related to time slots. If you need a precision of 15 minutes, you have 96 slots (4 slots in 1h * 24h in a day) with the duration of 15 minutes on any given day. Each slot can have one of two possible states: 1 - the slot is free, 0 - the slot is busy (or vice-versa, if you wish). Thus, you can represent a daily timetable with a string of 0
s and 1
s. For instance, the string (spaces added only for easily reading purpose) 0000 0000 0000 0000 0000 0000 0000 0000 0000 1110 0000 ...
represents a busy time slot between 00:00AM and 09:00AM (nobody works at night), followed by a free time slot between 9:00AM and 9:45AM (three 1
s in a row), followed by a busy time slot starting from 9:45AM.
So, you can write your models like this:
class Person(models.Model):
name = models.CharField(max_length=32)
class DailySchedule(models.Model):
person = models.ForeignKey(Person, related_name='day_schedule')
date = models.DateField()
schedule = models.CharField(max_length=96)
So, we encoded the information about available/busy time slots, but how can we extract it from the database? Luckily, Django has regex
field lookup possibility! And double luckily, it is supported in Django 1.4!!
Thus, in order to find who's available within a specific time slot, you can use DailySchedule.objects.filter(date=date, schedule__regex=r'<expression>')
. As it is not obvious what expression to use to extract different timeframes, we'll need a support function:
def time_slot_to_regex(start_time, end_time):
# times should be in HH:MM format
start_hour, start_minutes = start_time.split(':')
end_hour, end_minutes = end_time.split(':')
slots_before_needed_time = (int(start_hour)*4 + int(start_minutes)/15)
# compute how many hours are between given times and find out nr of slots
hour_duration_slots = (int(end_hour) - int(start_hour)) * 4 # 4 slots in each hour
# adjust nr of slots according to minutes in provided times.
# e.g. 9:30 to 10:45 - we have 10-9=1 hour, which is 4 time slots,
# but we need to subtract 2 time slots, because we don't have 9:00 to 10:00,
# but 9:30 to 10:00 so we subtract 30/15=2 timeslots and add what is left
# from the incomplete hour of 10:45 time, which is 45/15 minutes = 3 slots
minute_duration_slots = int(end_minutes)/15 - int(start_minutes)/15
total_duration = hour_duration_slots + minute_duration_slots
regular_expression = r'^[01]{%d}1{%d}' % (slots_before_needed_time, total_duration)
return regular_expression
Let's figure out how this function works
Suppose we want to find out who are available between 9:15 and 9:45AM. We call slots_expression = time_slot_to_regex('9:15', '9:45')
which computes:
slots_before_needed_time = 37
, which we got by multiplying 9 by 4 + 15/15. This is the number of slots that we don't care about, which will go into the first part of our regular_expression
string - '^[01]{37}'
hour_duration_slots = 0
, because the hour in both time values is the sameminute_duration_slots = 2
, which we got by subtracting 15/15 from 45/15regular_expression
, thus obtaining '^[01]{37}1{2}'
Now we can provide this regular expression to our filter, obtaining DailySchedule.objects.filter(schedule__regex=slots_expression)
and voila!, we have the result.
I've described the principle of encoding data, but did not mention the process to encode it. This can be done easily by using another support function that takes a string of existing busy/available slots and a start_date
and end_date
for which to update the existing schedule. If you need this method described as well, please let me know.
^[01]{36}1{4}[01]{24}1{4}
will look for people who are available from 9AM to 10AM and from 4PM to 5PMIf you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With