From a systems design/scalability perspective, what are some industry-standard strategies for dealing with a system that requires heavy writes to a particular table in a DB?
For simplicity's sake, let's say the table is an inventory table for products, with a 'Product Name' column and a 'Count' column, and Count simply increments by +1 each time a product is bought. There are millions of users buying different products every second, and we have to keep track of the latest count of each product, but it does not have to be strictly real-time; a 5-minute lag is acceptable.
My options are:
1) Master-slave replication, where the master DB handles all writes and the slaves handle reads. But this doesn't address the write-heavy problem.
2) Sharding the DB based on product-name range, or on its hashed value. But what if a specific product (e.g. Apple) receives a large number of updates in a short time? It'll still hit the same DB.
3) Batched updates? Use some kind of caching and write to the table every X seconds with the cumulative counts of whatever we've received in those X seconds. Is that a valid option, and what caching mechanism should I use? And what if there's a crash between the last read and the next write? How do I recover the lost count?
4) Any other obvious choices I forgot about?
Any insight is appreciated!
I’d say a solution will depend heavily on what exactly you need to do. A solution for writing thousands of records per second might be very different from incrementing a counter as in the example you provided. More so, there could be no tables at all to handle such load. Consistency/availability requirements are also missing from your question, and depending on them the entire architecture may be very different.
Anyway, back to your specific simplified case and your options.

The problem you’ll face here is database locking: every increment requires a record lock to avoid race conditions, so the processes writing to your DB will quickly end up waiting in a queue and your system will go down, even under a moderate load.
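To make the locking problem concrete, here is a minimal sketch of what each purchase turns into on the write path (table and column names are taken from the question; sqlite3 is used only as a stand-in for a real RDBMS). Every such UPDATE is a read-modify-write that must hold a lock for the duration of the transaction, so concurrent writers to a hot row serialize behind each other:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (product_name TEXT PRIMARY KEY, count INTEGER)")
conn.execute("INSERT INTO inventory VALUES ('Apple', 0)")

def record_purchase(product_name):
    # This UPDATE takes a lock until commit; concurrent writers hitting
    # the same row (a hot product like 'Apple') queue up behind it.
    conn.execute(
        "UPDATE inventory SET count = count + 1 WHERE product_name = ?",
        (product_name,),
    )
    conn.commit()

for _ in range(3):
    record_purchase("Apple")

print(conn.execute(
    "SELECT count FROM inventory WHERE product_name = 'Apple'"
).fetchone()[0])  # prints 3
```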
Your assumption is correct; this is not much different from option 1.
Very close. A caching layer: a lightweight storage engine providing concurrent atomic increments/decrements, with persistence so you don't lose your data. We’ve used redis for a similar purpose, although any other key-value database would do as well; there are literally dozens of such databases around.
A key-value database, or key-value store, is a data storage paradigm designed for storing, retrieving, and managing associative arrays, a data structure more commonly known today as a dictionary or hash table.
The solution would look as follows:
incoming requests → your backend server → kv_storage (atomic increment(product_id))
And you'll have a "flushing" script running on a schedule (e.g. */5 in cron) that does the following (simplified), for each product_id in kv_storage:

- read its current value
- update the DB record: count += value
- decrement the counter in kv_storage by the value just read
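The increment path plus the flushing script could be sketched as follows. This is a minimal illustration, not a production implementation: a dict guarded by a lock stands in for redis (with real redis you'd use its atomic INCRBY), and another dict stands in for the inventory table:

```python
import threading

# Stand-in for redis: counters guarded by a lock.
# Real redis gives you atomic INCRBY without an application-side lock.
kv_lock = threading.Lock()
kv_storage = {}

db_table = {}  # stand-in for the real inventory table

def on_purchase(product_id):
    # Hot path: one atomic in-memory increment, no DB write at all.
    with kv_lock:
        kv_storage[product_id] = kv_storage.get(product_id, 0) + 1

def flush():
    # Runs periodically (e.g. */5 cron): move accumulated deltas into the DB,
    # then decrement kv_storage by what was read (here: reset to zero).
    with kv_lock:
        deltas = dict(kv_storage)
        kv_storage.clear()
    for product_id, delta in deltas.items():
        db_table[product_id] = db_table.get(product_id, 0) + delta

for _ in range(5):
    on_purchase("apple")
on_purchase("pear")
flush()
print(db_table)  # {'apple': 5, 'pear': 1}
```

Note that snapshotting and clearing the counters under the same lock is what keeps a crash window small: a count is removed from kv_storage only once it has been read for flushing, so with a persistent store like redis the most you lose on a crash is one unflushed interval.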
You asked a typical CQRS question. "CQRS" stands for Command Query Responsibility Segregation. It is what it sounds like: you are separating your writes (commands) from your reads (queries). This approach solves problems when you have different needs between the writes and the reads, which is exactly your situation.
To achieve this in a scalable fashion, you need to acknowledge (i.e., accept) a request to increment and queue it for processing, while letting the reads work in real time per request. Process the queued requests with a background command handler that knows how to reconcile: if it fails, it should know how to resolve the conflict (e.g., if somebody else updated the row, retrieve the newer version and try again).
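As a sketch of that shape (all names here are invented for illustration): the write side only acknowledges a command by enqueueing it, and a background worker drains the queue at whatever pace it can sustain:

```python
import queue
import threading

command_queue = queue.Queue()
inventory = {"apple": 0}  # stand-in for the real table / read model

def handle_increment(product_id):
    # Background command handler. Real code would reconcile conflicts
    # (e.g. retry against a newer row version) rather than assume success.
    inventory[product_id] = inventory.get(product_id, 0) + 1

def worker():
    while True:
        cmd = command_queue.get()
        if cmd is None:  # shutdown sentinel
            break
        handle_increment(cmd)
        command_queue.task_done()

t = threading.Thread(target=worker)
t.start()

# The write side just acknowledges: enqueue and return immediately.
for _ in range(4):
    command_queue.put("apple")

command_queue.put(None)
t.join()
print(inventory["apple"])  # 4
```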
I completely disagree with another answer suggesting that queueing will bring down your entire system. Queueing does not bring anything down, because it is queueing and not real-time processing; that's the point of scaling. It's the opposite: making a change in real time, even if that just means flipping a boolean flag in an in-memory cache, is much worse than queueing. Just think what happens if the in-memory cache is down at that exact moment. Asynchronous offline (background) processing ensures that such problems don't prevent the command from eventually being processed. However, you may need to process the queued commands slowly (at whatever pace the system can handle without affecting reads) or against a separate copy of the data.
You could use a specific technology like an in-memory cache, as others suggested, but that again is just another implementation of the CQRS paradigm. It could be a cache, another copy of the record, or a database: same thing, same effect.