Database for long running transactions with huge updates

I am building a tool for data extraction and transformation. The typical use case is processing lots of data transactionally.

The numbers: transactions last about 10 seconds to 5 minutes and update 200-10,000 rows (the long duration is caused not by the database itself but by outside services that are used during the transaction).

There are two types of agents that access the database: multiple read agents and only one write agent (so there are never multiple concurrent writes).

During the transaction:

  • Read agents should be able to read the database and see it in its current (committed) state.
  • The write agent should be able to read the database (it both reads and writes during the transaction) and see it in the new, not yet committed state (see the sketch below).
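
To make the expectation concrete, here is a minimal sketch of that visibility behaviour under PostgreSQL's default READ COMMITTED isolation, using psycopg2. The table name, column names and connection string are placeholders, not my actual schema:

```python
import psycopg2

DSN = "dbname=extractor user=app"   # hypothetical connection string

writer = psycopg2.connect(DSN)
reader = psycopg2.connect(DSN)

with writer.cursor() as w, reader.cursor() as r:
    # The writer starts a long transaction and updates a row.
    w.execute("UPDATE items SET state = 'processed' WHERE id = %s", (42,))

    # Within its own (still open) transaction, the writer sees the new state.
    w.execute("SELECT state FROM items WHERE id = %s", (42,))
    print("writer sees:", w.fetchone()[0])   # -> 'processed'

    # The reader, on a separate connection, still sees the last committed state.
    r.execute("SELECT state FROM items WHERE id = %s", (42,))
    print("reader sees:", r.fetchone()[0])   # -> previous value, e.g. 'pending'

writer.commit()   # only now does the new state become visible to readers
```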

Is PostgreSQL a good choice for this type of load? I know it uses MVCC, so it should be OK in general, but is it OK to use long and big transactions extensively?

What other open-source transactional databases may be a good choice (I am not limited to SQL)?

P.S.

I do not know whether sharding may affect performance. The database will be sharded; for every shard there will be multiple readers and only one writer, but multiple different shards can be written to at the same time (a sketch of this layout follows).
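
For illustration only, this is roughly what the "one writer per shard, many shards written concurrently" layout looks like. The shard connection strings, table and hash routing are assumptions, not part of the actual design:

```python
import hashlib
import psycopg2

SHARD_DSNS = [                      # hypothetical shard connection strings
    "dbname=index_shard0 user=app",
    "dbname=index_shard1 user=app",
]

# One long-lived writer connection per shard (exactly one writer per shard).
writers = [psycopg2.connect(dsn) for dsn in SHARD_DSNS]

def shard_for(key: str) -> int:
    """Deterministically map a key to a shard number."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return digest[0] % len(SHARD_DSNS)

def write(key: str, payload: str) -> None:
    """Run a write on the single writer connection of the owning shard."""
    conn = writers[shard_for(key)]
    with conn.cursor() as cur:
        cur.execute("UPDATE items SET payload = %s WHERE key = %s", (payload, key))
    conn.commit()
```

Writes for different keys may land on different shards and proceed in parallel, while any single shard only ever has its one writer transaction open.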

I know it's better not to use outside services during a transaction, but in this case that's the goal. The database is used as a reliable and consistent index for a heavy, huge, slow and eventually-consistent data processing tool.

asked by Alex Craft


2 Answers

Huge disclaimer: as always, only a real-life test can tell you the truth.

But I think PostgreSQL will not let you down if you use the most recent version (at least 9.1, better 9.2) and tune it properly.

I have a somewhat similar load on my server, but with a slightly worse read/write ratio: about 10:1. Transactions range from a few milliseconds up to 1 hour (and sometimes even more), and one transaction can insert or update up to 100k rows. The total number of concurrent writers with long transactions can reach 10 or more. So far so good: I don't have any serious issues, and performance is great (certainly not worse than I expected).

What really helps is that my hot working data set almost fits into available memory.
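
As a rough first check of whether your data can stay cached, you can compare the database size against the memory-related settings. This is only a sketch with an assumed connection string; pg_database_size() and pg_settings are standard PostgreSQL facilities:

```python
import psycopg2

conn = psycopg2.connect("dbname=extractor user=app")   # hypothetical DSN
with conn.cursor() as cur:
    # Total on-disk size of the current database.
    cur.execute("SELECT pg_size_pretty(pg_database_size(current_database()))")
    print("database size:", cur.fetchone()[0])

    # Memory-related settings worth comparing against the machine's RAM.
    cur.execute(
        "SELECT name, setting, unit FROM pg_settings "
        "WHERE name IN ('shared_buffers', 'effective_cache_size', 'work_mem')"
    )
    for name, setting, unit in cur.fetchall():
        print(f"{name}: {setting} {unit}")
conn.close()
```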

So give it a try; it should work great for your load.

answered by mvp


Have a look at this link: Maximum transaction size in PostgreSQL

Basically, there can be technical limits on the software side to how large a single transaction can be.

answered by Kuberchaun
