I'm developing a lawyer booking system, where a person can book an appointment at a given time in a given day (the next lawyer's available day).
Let's say it is a ZocDoc for lawyers. The same structure, with appointments based on time: http://goo.gl/djUZb
I'm using MySQL and PHP.
CREATE TABLE `laywer_appointments` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`lawyer_id` INT unsigned,
`day_of_week` tinyint(3) unsigned DEFAULT '1',
`slot_date` date DEFAULT NULL,
`slot_time` time DEFAULT NULL,
`status` tinyint(4) NOT NULL DEFAULT '0',
`client_id` int(11) DEFAULT NULL, -- client_id = NULL means free slot
);
Each lawyer has default time slots based on the day of week (status = 0 means available). When inserting default slots, I don't provide a date, just day_of_week. Example data:
+-----------+-------------+-----------+-----------+
| lawyer_id | day_of_week | slot_time | status |
+-----------+-------------+-----------+-----------+
| 1 | 1 | 08:00 | 0 |
| 1 | 1 | 08:30 | 0 |
| 1 | 1 | 09:00 | 0 |
| 1 | 1 | 10:30 | 0 |
| 1 | 4 | 14:30 | 0 |
| 1 | 4 | 16:40 | 0 |
| 2 | 1 | 10:20 | 0 |
| 2 | 1 | 14:00 | 0 |
| 2 | 3 | 15:50 | 0 |
+-----------+-------------+-----------+-----------+
A lawyer can add a time slot to a specific day (even if this day is from a different day of week from his default slots) and can also lock (status = -1) one of the default slots in a specific day (i.e. he is on a meeting or he is sick):
+-----------+-------------+-----------+-----------+-----------+
| lawyer_id | day_of_week | slot_time | slot_date | status |
+-----------+-------------+-----------+-----------+-----------+
| 1 | 1 | 16:00 | 12/03/13 | 0 |
| 1 | 6 | 11:00 | 26/04/13 | 0 |
| 1 | 6 | 12:00 | 26/04/13 | 0 |
| 2 | 1 | 10:00 | 01/01/13 | -1 |
+-----------+-------------+-----------+-----------+-----------+
Then we have appointments booked. In this case we fill the slot_date and the client_id:
+-----------+-------------+-----------+-----------+-----------+
| lawyer_id | day_of_week | slot_time | slot_date | client_id |
+-----------+-------------+-----------+-----------+-----------+
| 1 | 1 | 10:30 | 12/03/13 | 10 |
+-----------+-------------+-----------+-----------+-----------+
As an example, with the above booking and assuming it is still 6:30 of the same day (12/03/13), the free available slots that have to be printed are:
8:00 - default slot
8:30 - default slot
9:00 - default slot
16:00 - Specific slot inserted in point 2 for 12/03/13
I have to return the next available date and the related free times (default ones, specific ones minus locked ones and booked ones). I can't just say "return times from Monday, 10/10/13".
In a search results page, I'll list all lawyers and the availability time table for each. So that means each lawyer will have a different time table every time a search is made.
I can't simply say "SELECT time FROM [bunch of joins] WHERE date = today".
I came with this query which ignores slots that are locked (status = -1) or booked (client_id not null), but of course it won't return the free times for the closest day with available times (or from today):
SELECT p.day_of_week, p.slot_date, p.slot_time
FROM laywer_appointments p
WHERE p.client_id IS NULL AND p.status = 0
AND p.slot_time NOT IN (
SELECT s.slot_time FROM laywer_appointments s
WHERE (s.slot_date IS NOT NULL AND s.client_id IS NOT NULL
OR s.status = -1) AND s.day_of_week = p.day_of_week
)
GROUP BY p.day_of_week, p.slot_date, p.slot_time
ORDER BY p.day_of_week ASC, p.slot_time ASC;
Another problem: if today is day_of_week = 5, but the next available day_of_week for a given lawyer is 2, how can I query that?
How to return the next closest and available day_of_week and aggregate to just return times from this day, not all days?
One thing I came with was to create 3 tables instead of one:
Then I'll store ALL free time slots for every day of the actual date up to an year in the slots table for every lawyer. (taken time slots from default_slots).
+-----------+-------------+-----------+-----------+-----------+
| lawyer_id | day_of_week | slot_time | slot_date | status |
+-----------+-------------+-----------+-----------+-----------+
| 1 | 1 | 16:00 | 12/03/13 | 0 |
| 1 | 1 | 16:00 | 12/03/13 | 0 |
| 1 | 2 | 08:00 | 13/03/13 | 0 |
| 1 | 2 | 09:00 | 13/03/13 | 0 |
... next week
| 1 | 1 | 16:00 | 19/03/13 | 0 |
| 1 | 1 | 16:00 | 19/03/13 | 0 |
| 1 | 2 | 08:00 | 20/03/13 | 0 |
| 1 | 2 | 09:00 | 20/03/13 | 0 |
... up to an year
| 1 | 1 | 16:00 | 20/03/14 | 0 |
| 1 | 1 | 16:00 | 20/03/14 | 0 |
| 1 | 2 | 08:00 | 21/03/14 | 0 |
| 1 | 2 | 09:00 | 21/03/14 | 0 |
+-----------+-------------+-----------+-----------+-----------+
I'll also have some cron jobs that run every week that adds another week of free slot records in the table slots and also remove past records to reduce table size and unused data.
A lawyer will also be able to lock a time directly into slots, as well add specific times (point 2).
For the listing, it will be a matter of getting the slots for a date equal or greater than today with free times, since every time of every date will have a row.
Implications on this solution: 1) Day one we will have 2500 lawyers (2nd month around 6000). Assuming 8 possible slots/daily X 20 days of work/month X 12 months = 1920 slot records per lawyer.
2500 laywers x 1920 records = 4,8 million records on day one. (~12M the second month)
Those records will be UPDATED, INSERTED and DELETED all the time. The slots table has some indexes, so I can't imagine write operations being made constantly on a table with 12M+ records and some indexes. Indexes being updated every second doesn't see smart to me.
I really can't come with a reasonable and scalable solution. My solution with one table only could work, but I can't think in a way of querying that at all. And the denormalized slots table will be huge, while needing constant write operations.
Any tips?
I have done something similar to what your trying to do so I understand how complicated it is :)
This was done in MSSQL so you will need to convert it to MySql.
These are the tables we ended up with:
TimeSlots:
We store both the default timesslots & modified timeslots for each staff member in this table (We have a column called "SlotType" in this table. SlotType 1 = DEFAULT TIMESLOTS & SlotType 2 = MODIFIED TIMESLOTS). If you look at "Tue 30/04/13" in the picture above you will see that we modified the timeslots for that day to only display a 9am appointment for this particular staff member.
ClosedDays:
This is a list of closed days - for example a staff member my not work on his birthday & Christmas day.
Appointments:
This is a list of appointments that have been booked (or waiting for booking confirmation).
SQL Query to get available appointments:
To check to appointments we then used the following SQL in our stored procedure. It checks one staff members appointments for the date specified. The final stored procedure that we are using loops though each staff member on the page for each day of the week to get all the appointments. Using this query to get 10 staff members appointments for the next 7 days = a total of 70 query's & takes about 300ms with a million records in each table. We are loading the appointments via ajax so 300ms is acceptable for our use & will prob change it to get each staff members appointments separately via ajax (so 7 query's at a time) to improve performance even more in the future.
DECLARE @MyDate date, @MyDayName nvarchar(10); IF @StartDate IS NULL SET @StartDate = GETDATE(); SET @MyDate = CAST(@StartDate AS date); SET @MyDayName = DATENAME(dw, @MyDate ); --NOTES: --@SlotType = 1 (DEFAULT TIMESLOTS), 2 (MODIFIED TIMESLOTS) --***CHECK TO SEE IF DOCTOR IS CLOSED TODAY*** IF NOT EXISTS (SELECT [ClosedDays].[ID] FROM [ClosedDays] WHERE [ClosedDays].[StaffID] = @StaffID AND [ClosedDays].[BusinessID] = @BusinessID AND [ClosedDays].[Active] = 1 AND @MyDate BETWEEN [ClosedDays].[StartDate] AND [ClosedDays].[EndDate]) BEGIN --***THE DOCTOR IS NOT CLOSED TODAY SO GET THE AVAILABLE TIMESLOTS*** --***CHECK TO SEE IF DOCTOR IS HAS MODIED TIMESLOTS TODAY*** IF NOT EXISTS (SELECT [TimeSlots].[ID], @MyDate AS SlotDate FROM [TimeSlots] WHERE [TimeSlots].[StaffID] = @StaffID AND [TimeSlots].[BusinessID] = @BusinessID AND [TimeSlots].[Active] = 1 AND [TimeSlots].[SlotType] = 2 AND [TimeSlots].[SlotDay] = @MyDayName AND @MyDate BETWEEN [TimeSlots].[StartDate] AND [TimeSlots].[EndDate] AND [TimeSlots].[ID] NOT IN (SELECT [Appointments].[TimeSlotID] FROM [Appointments]) ) BEGIN --***THE DOCTOR HAS NO MODIFIED TIMESLOTS FOR TODAY USE THE DEFAULT ONES*** SELECT [TimeSlots].[ID] AS SlotID, [TimeSlots].[StaffID], [TimeSlots].[BusinessID], CONVERT(nvarchar(10), @MyDate, 103) AS SlotDate, [TimeSlots].[SlotDay], LTRIM(RIGHT(CONVERT(nvarchar(10), [TimeSlots].[SlotTime], 100), 7))AS SlotTime FROM [TimeSlots] WHERE [TimeSlots].[StaffID] = @StaffID AND [TimeSlots].[BusinessID] = @BusinessID AND [TimeSlots].[Active] = 1 AND [TimeSlots].[SlotType] = 1 AND [TimeSlots].[SlotDay] = @MyDayName AND @MyDate BETWEEN [TimeSlots].[StartDate] AND [TimeSlots].[EndDate] AND NOT EXISTS (SELECT [Appointments].[TimeSlotID] FROM [Appointments] WHERE [Appointments].[TimeSlotID] = [TimeSlots].[ID]) END ELSE BEGIN --***THE DOCTOR HAS MODIFIED TODAYS TIMESLOTS SO USE THE MODIFIED TIMESLOTS*** SELECT [TimeSlots].[ID] AS SlotID, [TimeSlots].[StaffID], [TimeSlots].[BusinessID], CONVERT(nvarchar(10), @MyDate, 103) AS SlotDate, [TimeSlots].[SlotDay], LTRIM(RIGHT(CONVERT(nvarchar(10), [TimeSlots].[SlotTime], 100), 7))AS SlotTime FROM [TimeSlots] WHERE [TimeSlots].[StaffID] = @StaffID AND [TimeSlots].[BusinessID] = @BusinessID AND [TimeSlots].[Active] = 1 AND [TimeSlots].[SlotType] = 2 AND [TimeSlots].[SlotDay] = @MyDayName AND @MyDate BETWEEN [TimeSlots].[StartDate] AND [TimeSlots].[EndDate] AND NOT EXISTS (SELECT [Appointments].[TimeSlotID] FROM [Appointments] WHERE [Appointments].[TimeSlotID] = [TimeSlots].[ID]) END END ELSE BEGIN --***NO APPOINTMENTS WERE FOUND*** --***DUMMY QUERY TO RETURN NO RECORDS*** SELECT [TimeSlots].[ID] AS SlotID, [TimeSlots].[StaffID], [TimeSlots].[BusinessID], CONVERT(nvarchar(10), @MyDate, 103) AS SlotDate, [TimeSlots].[SlotDay], LTRIM(RIGHT(CONVERT(nvarchar(10), [TimeSlots].[SlotTime], 100), 7))AS SlotTime FROM [TimeSlots] WHERE [TimeSlots].[ID] = -0 END
Hope this makes sense & if anyone else has some idea's on how to optimize this more please let me know!
You're right that you'll have a big table. But it's not clear that your application will fail as a result. MySQL (and all DBMS software) is made to allow the accessing of large tables quickly.
Good dedicated MySQL server hardware (which has a 64-bit OS, two or four fast processors, plenty of RAM, and excellent file I/O -- SAS-interfaced fast disks) and properly configured server software will handle this workload.
You may wish to merge slot_time and slot_date into a single DATETIME or TIMESTAMP field, that can be indexed for ease of searching. If you choose to use TIMESTAMP data items you'll get some nice timezone-handling benefits if you do things right.
You may want to work out how to partition your big table using a scheme that lets you take a month's worth, or even a week's worth, of data offline when that month or week is past.
With 2,500 lawyers using your system, you're going to want to get this right. Why not spend some money on a decent database administrator? They cost less per hour than most lawyers. Sheeri Cabral wrote up a good summary of how to find one. http://www.sheeri.org/how-to-find-a-dba/
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With