Background / Application
I have a MySQL database containing a table of rentable properties and a table of bookings for these properties. There is also a search feature for finding available properties between two provided dates. When searching, the user can enter the start date, the amount of days they wish to stay, and a date flexibility of up to +/- 7 days. A booking can start on the same day as another booking ends (party 1 leaves in the morning, party 2 arrives in the evening).
I am having difficulty implementing the flexibility feature efficiently.
Schema
CREATE TABLE IF NOT EXISTS `property` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`name` varchar(60) COLLATE utf8_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
CREATE TABLE IF NOT EXISTS `property_booking` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`property_id` bigint(20) DEFAULT NULL,
`name` varchar(60) COLLATE utf8_unicode_ci DEFAULT NULL,
`date_start` date DEFAULT NULL,
`date_end` date DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Sample Data
INSERT INTO `property` (`name`)
VALUES ('Property 1'), ('Property 2'), ('Property 3');
INSERT INTO `property_booking` (`property_id`,`name`,`date_start`,`date_end`)
VALUES (1, 'Steve', '2011-03-01', '2011-03-08'),
(2, 'Bob', '2011-03-13', '2011-03-20'),
(3, 'Jim', '2011-03-16', '2011-03-23');
Sample Scenario
The user selects that they want to start their stay on 2011-03-10, they want to stay for 7 days, and they have a flexibility of +/- 2 days. I have compiled an image that visualises the data and parameters below. (Red: Booking 1, Green: Booking 2, Stripes: Booking 3, Blue: Date range (2011-03-10, + 7 days and +/- 2 days flexibility))
Expected Result
Property 1 (Bookings available throughout date range)
Property 3 (Bookings available starting on 2011-03-08 or 2011-03-09)
Current Method
My current query checks for overlap for all 7 day date ranges within the total searchable date range, like this:
SELECT p.`id`, p.`name`
FROM `property` p
WHERE (NOT (EXISTS (SELECT p2.`name` FROM `property_booking` p2 WHERE (p2.`property_id` = p.`id` AND '2011-03-10' < DATE_SUB(p2.`date_end`, INTERVAL 1 DAY) AND '2011-03-17' > DATE_ADD(p2.`date_start`, INTERVAL 1 DAY)))))
OR (NOT (EXISTS (SELECT p3.`name` FROM `property_booking` p3 WHERE (p3.`property_id` = p.`id` AND '2011-03-11' < DATE_SUB(p3.`date_end`, INTERVAL 1 DAY) AND '2011-03-18' > DATE_ADD(p3.`date_start`, INTERVAL 1 DAY)))))
OR (NOT (EXISTS (SELECT p4.`name` FROM `property_booking` p4 WHERE (p4.`property_id` = p.`id` AND '2011-03-09' < DATE_SUB(p4.`date_end`, INTERVAL 1 DAY) AND '2011-03-16' > DATE_ADD(p4.`date_start`, INTERVAL 1 DAY)))))
OR (NOT (EXISTS (SELECT p5.`name` FROM `property_booking` p5 WHERE (p5.`property_id` = p.`id` AND '2011-03-12' < DATE_SUB(p5.`date_end`, INTERVAL 1 DAY) AND '2011-03-19' > DATE_ADD(p5.`date_start`, INTERVAL 1 DAY)))))
OR (NOT (EXISTS (SELECT p6.`name` FROM `property_booking` p6 WHERE (p6.`property_id` = p.`id` AND '2011-03-08' < DATE_SUB(p6.`date_end`, INTERVAL 1 DAY) AND '2011-03-15' > DATE_ADD(p6.`date_start`, INTERVAL 1 DAY)))));
On the sample dataset, it's reasonably quick, but on much larger datasets it's going to get pretty sluggish, even more so when you build the full +/- 7 day flexibility.
Does anyone have any suggestions as to how this query could be better written?
Ok, here's a tricky answer for a tricky question...
SELECT * FROM property AS p
LEFT JOIN
(
SELECT property_id, DATEDIFF(MAX(date_end),20110308) AS startblock,
DATEDIFF(20110319,MIN(date_start))-1 AS endblock
FROM property_booking AS pb
WHERE date_start < 20110319 || date_end >= 20110308
GROUP BY property_id
HAVING LEAST(startblock,endblock) > 4
) AS p2 ON p.id = p2.property_id
WHERE p2.property_id IS NULL;
The subquery selects all the properties that are not eligible. The LEFT JOIN with IS NULL basically works out the exclusion (negation on the ineligible properties)
HAVING LEAST(startblock,endblock) > 4
in twice your +/- number (2*2)It took me a while to work it out (but your question was interesting and I had time on my hand)
I've tested it with edge cases and it worked for all the test cases I've thrown at it...). The logic behind it is a bit odd but a good old pen and paper helped me work it out!
Edit
Unfortunately I realized that this will work for most cases but not all... (2 single day bookings at the very beginning and end of the lookup period makes a property unavailable even though it should be available).
The problem here is that you have to look up information that's not 'present' in the DB and reconstruct it from what data you have. Check out my comment on your question to see a better way to deal with the problem
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With