
Schedule Searching in PHP/MySQL with templates and overrides

I'm looking for some advice/help on quite a complex search algorithm. Any links to articles on relevant techniques would be much appreciated.

Background

I'm building an application, which, in a nutshell, allows users to set their "availability" for any given day. The User first sets a general availability template which allows them to say:

Monday - AM   
Tuesday - PM  
Wednesday - All Day  
Thursday - None  
Friday - All Day

So this User is generally available Monday AM, Tuesday PM etc.

Schema:

id  
user_id  
day_of_week (1-7, Monday to Sunday)
availability

They can then override specific dates manually, for example:

2013-03-03 - am  
2013-03-04 - pm  
2013-03-05 - all_day

Schema:

id
user_id
date
availability

This all works well - I have a Calendar being generated which combines the template and overrides and allows Users to modify their availability etc.
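
As a rough illustration of how the template and overrides combine for a single date, here is a minimal Python sketch. The dictionaries are hypothetical in-memory stand-ins for the two tables above, and the function name is an assumption, not part of the application.

```python
from datetime import date

def effective_availability(day, template, overrides):
    """Return the availability for one date; an override beats the template."""
    if day in overrides:
        return overrides[day]
    # isoweekday() uses 1 = Monday ... 7 = Sunday, like day_of_week above.
    return template.get(day.isoweekday(), "none")

# Hypothetical data mirroring the examples in the question.
template = {1: "am", 2: "pm", 3: "all_day", 4: "none", 5: "all_day"}
overrides = {
    date(2013, 3, 3): "am",
    date(2013, 3, 4): "pm",
    date(2013, 3, 5): "all_day",
}

print(effective_availability(date(2013, 3, 4), template, overrides))  # pm (override on a Monday)
print(effective_availability(date(2013, 3, 6), template, overrides))  # all_day (Wednesday template)
```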

The Problem

I now need to allow Admin Users to search for Users who have specific availability. So the Admin User would use a calendar to select the required dates and availabilities and hit search.

For example, find me Users who are available:

2013-03-03 - pm
2013-03-04 - pm
2013-03-05 - pm

The search process would have to search for available Users using the Templated Availability and Overrides, then return the best results. Ideally, it would return Users who are available all of the time but in the case that no single user can match the dates, I need to provide a combination of Users who can.

I know this is quite a complex problem and I'm not looking for a complete answer, perhaps just some guidance or links to potentially relevant techniques etc.

What I've tried

At the moment, I have a halfway solution. I'm grabbing all the available Users, looping through each of them, and within that loop, looping through all of the required dates and breaking as soon as a User doesn't meet a required date. This obviously doesn't scale, and it only returns "perfect matches".
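
One possible way to get beyond perfect matches is to record which required dates each User covers and, when nobody covers them all, pick Users greedily. The Python sketch below assumes each User's effective availability (template plus overrides) has already been resolved per date; the function names and data shapes are hypothetical. Greedy selection is a heuristic for the set cover problem, so it may not return the smallest possible combination.

```python
def covers(have, need):
    """all_day satisfies am, pm, or all_day; otherwise the values must match."""
    return have == need or have == "all_day"

def rank_and_combine(required, availability_by_user):
    """required: {date_str: need}; availability_by_user: {user: {date_str: have}}.

    Returns (perfect_matches, combination). combination is a greedy list of
    (user, dates_covered) pairs, filled only when nobody matches every date.
    """
    covered_by = {
        user: {d for d, need in required.items() if covers(avail.get(d, "none"), need)}
        for user, avail in availability_by_user.items()
    }
    perfect = sorted(u for u, ds in covered_by.items() if len(ds) == len(required))
    if perfect:
        return perfect, []

    remaining = set(required)
    combination = []
    while remaining and covered_by:
        # Pick the user covering the most still-uncovered dates (ties: lowest id).
        user = max(sorted(covered_by), key=lambda u: len(covered_by[u] & remaining))
        gained = covered_by[user] & remaining
        if not gained:
            break  # the remaining dates cannot be covered by anyone
        combination.append((user, sorted(gained)))
        remaining -= gained
    return perfect, combination

required = {"2013-03-03": "pm", "2013-03-04": "pm", "2013-03-05": "pm"}
users = {
    1: {"2013-03-03": "pm", "2013-03-04": "am"},
    2: {"2013-03-03": "am", "2013-03-04": "all_day", "2013-03-05": "pm"},
    3: {"2013-03-03": "all_day"},
}
perfect, combination = rank_and_combine(required, users)
print(perfect, combination)
```

With this sample data no single User covers all three dates, so the result is a combination of User 2 (4th and 5th) and User 1 (3rd).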

Possible Solutions

Full Text Searching with Aggregate Table

I thought about creating a separate table which had the following schema:

user_id
body

The body field would be populated with the User's template days and overrides, so an example record might look like:

user_id: 2
body: monday_am tuesday_pm wednesday_pm thursday_am friday_allday 2013-03-03_all_day 2013-03-03_pm

I would then convert an Admin User's search query into a similar format. So if an Admin was looking for someone who was available on the 19th March 2013 - All Day and 20th March 2013 - PM, I'd convert that into a string.

Firstly, as 19th March is a Tuesday, I'd convert that into tuesday_allday, and do the same with the 20th (a Wednesday). I'd therefore end up with:

tuesday_allday wednesday_pm 2013-03-19_allday 2013-03-20_pm

I'd then do a full text search against our aggregate table and return a "weighted" result set which I can then loop through and further interrogate.
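
To make the conversion concrete, here is a small Python sketch of building the query string and scoring a body against it. It only illustrates the token idea; the function names are hypothetical, and the simple token-overlap count is a stand-in for whatever relevance score a real full text index would produce.

```python
from datetime import date

DAY_NAMES = ["monday", "tuesday", "wednesday", "thursday",
             "friday", "saturday", "sunday"]

def query_tokens(required):
    """required: list of (date, slot) pairs, slot in {'am', 'pm', 'allday'}."""
    # date.weekday() is 0 = Monday ... 6 = Sunday.
    day_tokens = [f"{DAY_NAMES[d.weekday()]}_{slot}" for d, slot in required]
    date_tokens = [f"{d.isoformat()}_{slot}" for d, slot in required]
    return day_tokens + date_tokens

def score(body, tokens):
    """Count how many query tokens appear in a user's aggregate body."""
    words = set(body.split())
    return sum(1 for t in tokens if t in words)

tokens = query_tokens([(date(2013, 3, 19), "allday"), (date(2013, 3, 20), "pm")])
print(" ".join(tokens))
# tuesday_allday wednesday_pm 2013-03-19_allday 2013-03-20_pm
```

One caveat worth checking: a User whose override makes them unavailable on a date still matches the weekday token, which is one reason the weighted results would need the follow-up interrogation described above.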

I'm not sure how this would work in practice, so that's why I'm asking if anyone has any links to techniques or relevant articles I could use.

Darren Taylor asked Mar 18 '13 17:03


1 Answer

I am confident this problem can be solved with a better-defined DB schema. With a more detailed schema you will be able to find any available user for any given time frame (not just AM & PM), should you choose to. It will also let you keep template data without polluting your availability data with template information: you select from the template table to programmatically fill in the availability for a given date, which the user can then modify.

I spent some time diagramming this problem and came up with a schema structure that I believe solves the problem you specified and allows you to grow your application with a minimum of schema changes. (To make this easier to read I've added the SQL at the end of this proposed answer)

I have also included an example SELECT statement that lets you pull availability data with any number of arguments. For clarity, that SELECT appears above the schema SQL, at the end of my explanatory text. Please don't be intimidated by the SELECT: it may look complicated at first glance, but it is really a map of most of the schema (it leaves out the Templates table and the TimeFrames and DaysOfWeek lookup tables). (By the way, I'm not saying that because I doubt you can understand it; I'm sure you can. But I've known many programmers who avoid more complex DB structures to their own detriment because they LOOK overly complex, when on analysis they are less complex than the acrobatics needed in the program to get similar results. Relational DBs are based on a branch of mathematics that is good at accurately, consistently, & (relatively) succinctly associating data.)

General Use (for more details read the comments in the SQL CREATE TABLE statements):

- Populate the DaysOfWeek table.
- Populate the TimeFrames table with the time frames you want to track (an AM time frame might have a StartTime of 00:00:00 & an EndTime of 11:59:59, while PM might have a StartTime of 12:00:00 & an EndTime of 23:59:59).
- Add Users.
- Add Dates to be tracked (see the notes in the SQL for thoughts on avoiding bloat & also the virtues of this table).
- Populate the Templates table for each user.
- Generate the list of default Availabilities (with their associated AvailableTimes data) for each user.
- Expose the default Availabilities to the users so they can override the defaults.

NOTE: you can also add an optional table for Engagements to be the opposite of Availabilities (or maybe there is a better abstraction that would include both concepts...).

Disclaimer: I did not take the additional time to fully populate my local DB & verify everything, so there may be some weaknesses/errors I did not see in my diagrams (sorry, I spent far longer than intended on this & must get work done on an overdue project). While I have worked fairly extensively with DB structures & with DBs others have created for 12+ years, I'm sure I am not without fault. I hope others on StackOverflow will correct any mistakes I may have included.
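
The "generate the list of default Availabilities" step could look roughly like the Python sketch below. It assumes the day numbering from the DaysOfWeek data recommended in this answer (1 = Sunday through 7 = Saturday, plus 8 = AllWeek, 9 = Weekdays, 10 = Weekends); the function names and the tuple layout are hypothetical, and a real implementation would insert rows into Availabilities and AvailableTimes instead of returning tuples.

```python
from datetime import date, timedelta

# Group rows from DaysOfWeek expanded to the individual day IDs they stand for.
GROUPS = {8: {1, 2, 3, 4, 5, 6, 7}, 9: {2, 3, 4, 5, 6}, 10: {1, 7}}

def day_id(d):
    """Map a date to the 1 = Sunday ... 7 = Saturday numbering."""
    return d.isoweekday() % 7 + 1

def default_availabilities(templates, start, end):
    """templates: list of (time_id, user_id, day_o_week_id) rows.

    Returns (user_id, date, time_id) for every date in [start, end]
    that a template row applies to.
    """
    rows = []
    d = start
    while d <= end:
        for time_id, user_id, dow in templates:
            if day_id(d) in GROUPS.get(dow, {dow}):
                rows.append((user_id, d, time_id))
        d += timedelta(days=1)
    return rows

# User 1: time frame 1 every weekday, plus time frame 2 on Tuesdays.
templates = [(1, 1, 9), (2, 1, 3)]
rows = default_availabilities(templates, date(2013, 3, 3), date(2013, 3, 5))
for row in rows:
    print(row)
```

For the three days shown (a Sunday through a Tuesday) this produces one row for Monday and two for Tuesday, and nothing for Sunday.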

I apologize for not including more example data. If I have time in the near future I will provide some (think adding George, Fred, & Harry to the Users table, adding some dates to the Dates table, then detailing how busy George & Fred are compared to Harry during their school week using the Availabilities, AvailableTimes & TimeFrames tables).

The SELECT statement (NOTE: I would highly recommend making this into a view... in that way you can select whatever columns you want & add whatever arguments/conditions you want in a WHERE clause without having to write the joins out every time... so the view would NOT include the WHERE clause... just to make that clear):

SELECT *
FROM Users Us
JOIN Availabilities Av
ON Us.User_ID=Av.User_ID
JOIN Dates Da
ON Av.Date_ID=Da.Date_ID
JOIN AvailableTimes Avt
ON Av.Av_ID=Avt.Av_ID
WHERE Da.Date='2014-01-03' -- whatever date
-- alternately: WHERE Da.DayOWeek_ID=4 -- which would be Wednesday (with 1 = Sunday)
-- WHERE Da.Date BETWEEN '2014-01-01' AND '2014-01-07' -- whatever date range...
-- etc...
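
To show how the view idea could answer the original search, here is a hedged sketch that runs against an in-memory SQLite database through Python's sqlite3 module, standing in for MySQL. The view name UserAvailability, the extra join to TimeFrames, and the sample rows are assumptions made for the illustration. The query counts how many of the required date/time-frame pairs each user matches, so users matching every pair sort first and partial matches follow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (User_ID INTEGER PRIMARY KEY, UserName TEXT);
CREATE TABLE Dates (Date_ID INTEGER PRIMARY KEY, DayOWeek_ID INTEGER, Date TEXT);
CREATE TABLE TimeFrames (Time_ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Availabilities (Av_ID INTEGER PRIMARY KEY, User_ID INTEGER, Date_ID INTEGER);
CREATE TABLE AvailableTimes (AvTime_ID INTEGER PRIMARY KEY, Av_ID INTEGER, Time_ID INTEGER);

-- The view holds only the joins; callers add their own WHERE clause.
CREATE VIEW UserAvailability AS
SELECT Us.User_ID, Us.UserName, Da.Date, Tf.Name AS TimeFrame
FROM Users Us
JOIN Availabilities Av ON Us.User_ID = Av.User_ID
JOIN Dates Da ON Av.Date_ID = Da.Date_ID
JOIN AvailableTimes Avt ON Av.Av_ID = Avt.Av_ID
JOIN TimeFrames Tf ON Avt.Time_ID = Tf.Time_ID;

INSERT INTO Users VALUES (1, 'George'), (2, 'Fred');
INSERT INTO Dates VALUES (1, 1, '2013-03-03'), (2, 2, '2013-03-04');
INSERT INTO TimeFrames VALUES (1, 'AM'), (2, 'PM');
INSERT INTO Availabilities VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1);
INSERT INTO AvailableTimes VALUES (1, 1, 2), (2, 2, 2), (3, 3, 2);
""")

required = [("2013-03-03", "PM"), ("2013-03-04", "PM")]
conditions = " OR ".join("(Date = ? AND TimeFrame = ?)" for _ in required)
params = [value for pair in required for value in pair]
sql = f"""
SELECT UserName, COUNT(DISTINCT Date) AS matched
FROM UserAvailability
WHERE {conditions}
GROUP BY User_ID, UserName
ORDER BY matched DESC, UserName
"""
results = conn.execute(sql, params).fetchall()
print(results)  # [('George', 2), ('Fred', 1)]
```

Adding `HAVING COUNT(DISTINCT Date) = <number of required dates>` would restrict the result to perfect matches only.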

Recommended data in DaysOfWeek (which is effectively a lookup table):

INSERT INTO DaysOfWeek(DayOWeek_ID,Name,Description)
VALUES (1,'Sunday', 'First Day of the Week'),(2,'Monday', 'Second Day of the Week')...(7,'Saturday', 'Last Day of the Week'),(8,'AllWeek','The entire week'),(9,'Weekdays', 'Monday through Friday'),(10,'Weekends','Saturday & Sunday')

Example Templates data:

INSERT INTO Templates(Time_ID,User_ID,DayOWeek_ID)
VALUES (1,1,9) -- the first user is available for the first time frame every weekday as their default
,(2,1,3) -- the first user is also available on Tuesdays for the second time frame

The following is the recommended schema structure:

CREATE  TABLE `test`.`Users` (
  `User_ID` INT NOT NULL AUTO_INCREMENT ,
  `UserName` VARCHAR(45) NULL ,
  PRIMARY KEY (`User_ID`) );

CREATE  TABLE `test`.`Templates` (
  `Template_ID` INT NOT NULL AUTO_INCREMENT ,
  `Time_ID` INT NULL ,
  `User_ID` INT NULL ,
  `DayOWeek_ID` INT NULL ,
  PRIMARY KEY (`Template_ID`) )
COMMENT = 'This table holds the template data for general expected availability of a user/agent/person (so the person would use this to set their general availability)';

CREATE  TABLE `test`.`Availabilities` (
  `Av_ID` INT NOT NULL AUTO_INCREMENT ,
  `User_ID` INT NULL ,
  `Date_ID` INT NULL ,
  PRIMARY KEY (`Av_ID`) )
COMMENT = 'This table holds the actual availability of a user for a particular date.\nIf the user is not available for a date then this table has no entry for that user for that date.\n(btw, this suggests the possibility of an alternate table called Engagements, which could utilize all other structures except the templates and would record when a user is actually busy... in order to use this table & the other table together you would need to always join to AvailableTimes, as a date would actually be in both tables but associated with different time frames).';

CREATE  TABLE `test`.`Dates` (
  `Date_ID` INT NOT NULL AUTO_INCREMENT ,
  `DayOWeek_ID` INT NULL ,
  `Date` DATE NULL ,
  PRIMARY KEY (`Date_ID`) )
COMMENT = 'This table is utilized to hold actual dates with which users/agents can be associated.\nThe important thing to note here is: this may end up holding every day of every year... this suggests a need to archive this data (and everything associated with it) for performance reasons as this database is utilized.\nOne more important detail... this is more efficient than associating actual dates directly with each user/agent with an availability on that date... this way the date is only recorded once, the other approach records this date with the user for each availability.';

CREATE  TABLE `test`.`AvailableTimes` (
  `AvTime_ID` INT NOT NULL AUTO_INCREMENT ,
  `Av_ID` INT NULL ,
  `Time_ID` INT NULL ,
  PRIMARY KEY (`AvTime_ID`) )
COMMENT = 'This table records the time frames that a user is available on a particular date.\nThis allows the time frames to be flexible without affecting the structure of the DB.\n(e.g. if you only keep track of AM & PM at the beginning of the use of the DB but later decide to keep track on an hourly basis you simply add the hourly time frames & start populating them, no changes to the DB schema need to be made)';

CREATE  TABLE `test`.`TimeFrames` (
  `Time_ID` INT NOT NULL AUTO_INCREMENT ,
  `StartTime` TIME NOT NULL ,
  `EndTime` TIME NOT NULL ,
  `Name` VARCHAR(45) NOT NULL ,
  `Desc` VARCHAR(128) NULL ,
  PRIMARY KEY (`Time_ID`) ,
  UNIQUE INDEX `Name_UNIQUE` (`Name` ASC) )
COMMENT = 'Utilize this table to record the times that are being tracked.\nThis allows the flexibility of having multiple time frames on the same day.\nIt also provides the flexibility to change the time frames being tracked without changing the DB structure.';

CREATE  TABLE `test`.`DaysOfWeek` (
  `DayOWeek_ID` INT NOT NULL AUTO_INCREMENT ,
  `Name` VARCHAR(45) NOT NULL ,
  `Description` VARCHAR(128) NULL ,
  PRIMARY KEY (`DayOWeek_ID`) ,
  UNIQUE INDEX `Name_UNIQUE` (`Name` ASC) )
COMMENT = 'This table is a lookup table to hold the days of the week.\nI personally would recommend adding a row for:\nWeekends, All Week, & WeekDays \nThese will often be used in conjunction with the templates, and having those 3 entries in this table means fewer entries need to be made in that table.';

MER answered Oct 15 '22 02:10