
System design: Strategies for dealing with heavy writes to a DB

From a systems design/scalability perspective, what are some industry-standard strategies for dealing with a system that requires heavy writes to a particular table in a DB?

For simplicity's sake, let's say the table is an inventory table for products, with a 'Product Name' column and a 'Count' column, and the count simply increments by 1 each time a new product is bought into the system. There are millions of users buying different products every second, and we have to keep track of the latest count of each product, but it does not have to be strictly real-time; a 5-minute lag is acceptable.

My options are:

1) Master-slave replication, where the master DB handles all writes and the slaves handle reads. But this doesn't address the write-heavy problem.

2) Sharding the DB based on product name range, or its hashed value. But what if there's a specific product (e.g. Apple) that receives a large number of updates in a short time? It'll still hit the same shard.

3) Batched updates? Use some kind of caching and write to the table every X seconds with the cumulative counts of whatever we've received in those X seconds. Is that a valid option, and what caching mechanism do I use? And what if there's a crash between the last read and the next write? How do I recover the lost count?

4) Any other obvious choices I forgot about?

Any insight is appreciated!

asked Oct 29 '18 by user1008636


2 Answers

I’d say the solution will depend heavily on what exactly you need to do. A solution for writing thousands of records per second might be very different from incrementing a counter as in the example you provided. What's more, there might be no tables at all to handle such load. Consistency/availability requirements are also missing from your question, and depending on them the entire architecture could be very different.

Anyway, back to your specific, simplified case and your options.

Option 1 (Master slave replication)

The problem you’ll face here is database locking: every increment requires a record lock to avoid race conditions, so processes writing to your DB will quickly end up waiting in a queue and your system will go down, even under moderate load.
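As a minimal illustration of where the contention comes from (assuming MySQL/InnoDB and a hypothetical inventory table; names are illustrative, not from the question's schema), every purchase boils down to an UPDATE that takes a row lock on the same hot row:

    import mysql.connector

    # Each purchase competes for the row lock on the 'Apple' row;
    # under heavy concurrency these UPDATEs queue up behind each other.
    db = mysql.connector.connect(host="localhost", user="app",
                                 password="secret", database="shop")

    def record_purchase(product_name: str) -> None:
        cur = db.cursor()
        cur.execute(
            "UPDATE inventory SET `count` = `count` + 1 WHERE product_name = %s",
            (product_name,),
        )
        db.commit()
        cur.close()

    record_purchase("Apple")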

Option 2 (Sharding the DB)

Your assumption is correct: a hot product (e.g. Apple) still hits a single shard, so this isn't much different from option 1.

Option 3 (Batched updates)

Very close. Add a caching layer backed by a lightweight store that provides concurrent atomic increments/decrements, with persistence so you don't lose your data. We've used Redis for a similar purpose, although any other key-value database would do as well; there are literally dozens of such databases around.

A key-value database, or key-value store, is a data storage paradigm designed for storing, retrieving, and managing associative arrays, a data structure more commonly known today as a dictionary or hash table

The solution would look as follows:

incoming requests → your backend server → kv_storage (atomic increment(product_id))
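A minimal sketch of that write path, assuming Redis as the kv_storage and the redis-py client (the key naming is just an illustration):

    import redis

    # Redis (or any key-value store with atomic counters) absorbs the write burst.
    r = redis.Redis(host="localhost", port=6379, decode_responses=True)

    def record_purchase(product_id: str) -> None:
        # INCR is atomic, so concurrent buyers never race each other.
        r.incr(f"inventory:pending:{product_id}")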

And you'll have a "flushing" script running periodically, e.g. every 5 minutes (*/5 in cron), that does the following (simplified; a sketch follows the list below):

  1. for every product_id in kv_storage read its current value
  2. update your db counter (+= value)
  3. decrement the value in kv_storage by the amount you just read, so increments that arrived in the meantime aren't lost
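
A simplified version of that flushing script might look like this (again assuming Redis plus a relational inventory table; key and table names are only illustrative):

    import redis
    import mysql.connector

    r = redis.Redis(host="localhost", port=6379, decode_responses=True)
    db = mysql.connector.connect(host="localhost", user="app",
                                 password="secret", database="shop")

    def flush_pending_counts() -> None:
        cur = db.cursor()
        # 1. for every product_id in kv_storage read its current value
        for key in r.scan_iter("inventory:pending:*"):
            product_id = key.split(":")[-1]
            pending = int(r.get(key) or 0)
            if pending == 0:
                continue
            # 2. update the db counter (+= value)
            cur.execute(
                "UPDATE inventory SET `count` = `count` + %s WHERE product_id = %s",
                (pending, product_id),
            )
            db.commit()
            # 3. decrement the value in kv_storage by what was just flushed,
            #    so increments that arrived during the flush are kept
            r.decrby(key, pending)
        cur.close()

Run it from cron; if a run fails before the DB commit, the pending counts simply stay in the key-value store and are picked up on the next run.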

Further scaling

  • if the script fails, nothing bad happens - the updates simply arrive on the next run
  • if your backend boxes can't handle the load - you can easily add more boxes
  • if a single key-value db can't handle the load - most of them support scaling over multiple boxes, or a simple sharding strategy in your backend scripts would work fine
  • if a single "flushing" script doesn't keep up with increments - you can run several of them and decide which key ranges are handled by each one
answered Sep 27 '22 by ffeast


You asked a typical CQRS question. "CQRS" stands for Command Query Responsibility Segregation. It is what it sounds like - you are separating your writes (commands) from your reads (queries). This approach solves problems when you have different needs between the writes and reads - exactly your situation.

To achieve this in a scalable fashion, you need to acknowledge (i.e., accept) a request to increment and queue it for processing, while letting the reads work in real time per request. Process the queued requests with a background command handler that knows how to reconcile: if it fails, it should know how to resolve a conflict (e.g., if somebody else updated the row, retrieve a newer version and try again).
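A rough sketch of that shape, using an in-process queue and a single background worker purely for illustration (a real system would typically use a durable message queue, and apply_increment here is a hypothetical write-side handler carrying the retry/reconcile logic):

    import queue
    import threading

    command_queue: "queue.Queue[str]" = queue.Queue()

    def accept_purchase(product_id: str) -> None:
        # Acknowledge the command immediately; actual processing happens later.
        command_queue.put(product_id)

    def apply_increment(product_id: str) -> None:
        # Hypothetical write-side logic: update the counter, and on a conflict
        # re-read the row and retry (optimistic concurrency).
        ...

    def command_handler() -> None:
        # Background worker: drains commands at whatever pace the write side can handle.
        while True:
            product_id = command_queue.get()
            try:
                apply_increment(product_id)
            except Exception:
                command_queue.put(product_id)  # naive retry: requeue the command
            finally:
                command_queue.task_done()

    threading.Thread(target=command_handler, daemon=True).start()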

I completely disagree with the other answer's suggestion that queuing will bring down your entire system. Queuing does not bring anything down, precisely because it's queuing and not real-time processing; that's the point of scaling. On the contrary, making the change in real time, even if that just means flipping a boolean flag in an in-memory cache, is much worse than queuing: think about what happens if the in-memory cache is down at that exact moment. Asynchronous offline (background) processing ensures that such problems don't prevent the command from eventually being processed. However, you may need to either process the queued commands slowly (at whatever pace the system can handle without affecting reads) or in a separate copy of the data.

You could use a specific technology like an in-memory cache, as suggested in the other answer, but that again is just another implementation of the CQRS paradigm. It could be a cache, or just another copy of the record, or a database - same thing and same effect.

answered Sep 27 '22 by Tengiz