I am facing a situation of designing a booking system of a hotel website that may handle thousand and thousands of hotel rooms, and their availability. Something like the Expedia website.
This availability would need to be stored locally, and I would need to query these hotel rooms for availability.
So far I have arrived in the following structure:
The matrix would look like the following:
Where 1 means that room is available on that date, and 0 means it is booked. So for me to detect if a room_type is available in a date range, I have to:
In the above table, suppose the date range was 10/10/2017 to 10/14/2017 (US format), the only available room would be room 101 at hotel C.
I can then book a Customer to that room, so the customer won't need to change rooms during his/hers stay.
My questions are:
I am worried about not picking the correct structure now, and stumble into performance problems later in the future when the website is under high demand and with an even higher volume of rooms.
Thank you.
First of all, thanks for the question. It is one of the complex problems that I know, considering the variety of use-cases that will come on top of just the basic hotel availability. Some examples: how do you do geo-searches, how to support multiple check-ins in the same day for the same room, how to keep the data highly consistent with the world-view (the up-to-date price and availability), how do you rank hotels based on their price, how to add more attributes like custom benefits (smoke-free, 24-hour checkin, kitchen included, etc.), how to handle different cancellation policy, etc.
You get the point, I believe.
The best designs I have encountered use column or schema-less stores (like cassandra, mongo), along with a RDBMS.
NoSQL stores give flexibility to keep adding more attributes and also to search across multiple dimensions like city, nearby search, etc.
RDBMS can store rarely changing data like different room types, hotel description, etc.
For example, for the simple use-case above, I would store all the different room types, hotel to room mappings in a RDBMS. But the hotel price and availability would go to a NoSQL store. I would have room prices and availability stored in a NoSQL store with search attributes like check-in/check-out date, number of people, number of rooms as the keys to this table. I would have a cron job running periodically to populate all the different search permutations for all the rooms available (by cross lookups on another booking table) and update the NoSQL store. This way the read-path is faster (this is very similar to fan-out on write approach that many social media websites design their architecture. Optimise the read path and propagate updates on the write-path, fully aware that read-data can be a little stale). In our case, the price and availability might be stale for the time period between the cron job runs, but this can be easily offset by making another check with a consistent store (say a booking table), just before the time of booking. If there are differences, you can start the workflow again.
Note: the above is a highly subjective opinion and based on the assumption that prices and availability change infrequently. You could use a combination of streaming systems like kafka to make the updates faster
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With