Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Architecture for Django models to implement Timetable(scheduling) functionality

I'm building Django app intended to maintain timetable(schedule) for 10K+ persons(and more in the future). Basically, problem statement is following: each person has separate timetable with free slots for the next year. It is discrete with step of 15 minutes. I need to develop models architecture(which will imply database design underneath) to do the following:

  1. Query all free time slots for given person.
  2. Query all persons who are free during a certain time.

For example, I have John, who is free from 8AM-14PM 14 November and Sara who is free from 10AM to 11AM 14 November. If I query John's free time slots I want to get "8AM-14PM 14 November". If I query "free persons from 8AM to 11AM" I get John as Sara is not free until 10AM. If I query "free persons from 10AM to 11AM" I want to get both John and Sara. I thought about this problem and my ideas are below.

Solution №1: We create a model FreeTimeSlot that would store information about every 15 minute time span interval and build throughout relation to it from person.

    class Person(models.Model):
        name = models.CharField(max_length=32, null=False, blank=False)
        free_slots = models.ManyToManyField(FreeTimeSlot, related_name='tutor_set', null=True, blank=True, through='PersonSlot')

    class TimeSlot(models.Model):
        time = models.DateTimeField(db_index=True) #perhaps other field type

    class PersonSlot(models.Model):
        person = models.ForeignKey(Person)
        timeslot = models.ForeignKey(Slot)

        class Meta:
            db_table = 'person_free_slots'
            unique_together = (('timeslot', 'person'))

We create 365*24*4 TimeSlot models for every 15 minute interval in the upcoming year, and if person indicates in his schedule free time we add relation to that TimeSlot. With this kind of architecture getting free timeslots for person is as simple as going through manager: person.free_time_slots Getting all persons free at a certain time (e.g. 10-10:45) is also quite easy, smoothing like:

timeslots = TimeSlot.objects.filter(time__in=['10:00', '10:15', '10:30'])
PersonSlot.objects.filter(timeslot__in=timeslots).values('person')

Solution №2: We avoid creating model for every timeslot but keep date in PersonTime model itself:

    class Person(models.Model):
        name = models.CharField(max_length=32, null=False, blank=False)

    class TimeSlot(models.Model):
        person = models.ForeignKey(Person, related_name='slots')
        time_start = models.DateTimeField(db_index=True)
        time_end = models.DateTimeField(db_index=True)

Getting list of free time slots is also easy(person.slots). Getting all persons free at a certain time (e.g. 10-10:45) would be like:

TimeSlot.objects.filter(time_start__gte="10:00", time_end__lte="10:45").values('person')

This solution will not work with intersecting intervals and I'm not sure if querying on indexed time for interval(using gte and lte comparisons on the same field) will work, and will work fast. I use Postgres if it can matter. I also wrote time queries in pseudocode to make code simpler.

So my question is the following, how good django developers would implement this functionality to provide speed for both queries on big data? I would appreciate an advice on possible caveats/benefits to my current solutions or new ideas.

like image 806
Dmytriy Voloshyn Avatar asked Nov 13 '15 20:11

Dmytriy Voloshyn


1 Answers

Let's split this question into 2 parts.

Part 1 - Data encoding

Consider encoding the data related to time slots. If you need a precision of 15 minutes, you have 96 slots (4 slots in 1h * 24h in a day) with the duration of 15 minutes on any given day. Each slot can have one of two possible states: 1 - the slot is free, 0 - the slot is busy (or vice-versa, if you wish). Thus, you can represent a daily timetable with a string of 0s and 1s. For instance, the string (spaces added only for easily reading purpose) 0000 0000 0000 0000 0000 0000 0000 0000 0000 1110 0000 ... represents a busy time slot between 00:00AM and 09:00AM (nobody works at night), followed by a free time slot between 9:00AM and 9:45AM (three 1s in a row), followed by a busy time slot starting from 9:45AM.

So, you can write your models like this:

class Person(models.Model):
    name = models.CharField(max_length=32)

class DailySchedule(models.Model):
    person = models.ForeignKey(Person, related_name='day_schedule')
    date = models.DateField()
    schedule = models.CharField(max_length=96)

Part 2 - Querying

So, we encoded the information about available/busy time slots, but how can we extract it from the database? Luckily, Django has regex field lookup possibility! And double luckily, it is supported in Django 1.4!!

Thus, in order to find who's available within a specific time slot, you can use DailySchedule.objects.filter(date=date, schedule__regex=r'<expression>'). As it is not obvious what expression to use to extract different timeframes, we'll need a support function:

def time_slot_to_regex(start_time, end_time):
    # times should be in HH:MM format
    start_hour, start_minutes = start_time.split(':')
    end_hour, end_minutes = end_time.split(':')

    slots_before_needed_time = (int(start_hour)*4 + int(start_minutes)/15)

    # compute how many hours are between given times and find out nr of slots
    hour_duration_slots = (int(end_hour) - int(start_hour)) * 4  # 4 slots in each hour

    # adjust nr of slots according to minutes in provided times. 
    # e.g. 9:30 to 10:45 - we have 10-9=1 hour, which is 4 time slots, 
    # but we need to subtract 2 time slots, because we don't have 9:00 to 10:00, 
    # but 9:30 to 10:00 so we subtract 30/15=2 timeslots and add what is left 
    # from the incomplete hour of 10:45 time, which is 45/15 minutes = 3 slots
    minute_duration_slots = int(end_minutes)/15 - int(start_minutes)/15

    total_duration = hour_duration_slots + minute_duration_slots

    regular_expression = r'^[01]{%d}1{%d}' % (slots_before_needed_time, total_duration)

    return regular_expression

Let's figure out how this function works

Suppose we want to find out who are available between 9:15 and 9:45AM. We call slots_expression = time_slot_to_regex('9:15', '9:45') which computes:

  • slots_before_needed_time = 37, which we got by multiplying 9 by 4 + 15/15. This is the number of slots that we don't care about, which will go into the first part of our regular_expression string - '^[01]{37}'
  • hour_duration_slots = 0, because the hour in both time values is the same
  • minute_duration_slots = 2, which we got by subtracting 15/15 from 45/15
  • previous 2 added together gives us 2 slots that we need to set to 1 in our regular_expression, thus obtaining '^[01]{37}1{2}'

Now we can provide this regular expression to our filter, obtaining DailySchedule.objects.filter(schedule__regex=slots_expression) and voila!, we have the result.

Saving data process

I've described the principle of encoding data, but did not mention the process to encode it. This can be done easily by using another support function that takes a string of existing busy/available slots and a start_date and end_date for which to update the existing schedule. If you need this method described as well, please let me know.

Pros

  • no M2M relationship, which results in faster queries
  • can search for several free time slots within a day, using a different regular expression (for instance ^[01]{36}1{4}[01]{24}1{4} will look for people who are available from 9AM to 10AM and from 4PM to 5PM
  • relatively simple implementation
  • as a side effect, you'll have easier access to find out the busy time slots as you'll have the required information in the database and won't need to run computations for it

Cons

  • not verbose and for some it may be confusing
  • requires more space in the database, as you are going to save the free as well as the busy time slots
like image 155
iulian Avatar answered Sep 30 '22 11:09

iulian